Foreword


It took the joint effort of a leading phys-
icist and a mathematician to write an
entirely new type of book for future sci- 
entists and engineers. This calls for a 
few words about its purpose and scien- 
tific ideology. 

Many physicists, as is known, are 
displeased with the presentation of the 
basics of mathematical analysis that 
was introduced by mathematicians in 
the first half of the 20th century. The 
reasons are understandable. The devel- 
opment of mathematical research con- 
nected with the elaboration of the log- 
ical foundations of our science left 
its imprint on the style of presentation 
even of the very first chapters of text- 
books for beginners. Exact definitions 
of such concepts as a real number, lim- 
it, and continuity were the result of a 
prolonged and very nontrivial logical 
analysis of theories that were already 
created and were intuitively clear (on 
the scientific level of rigor). These de- 
finitions, which are not at all simple for 
the beginner, came to be used in the 
wrong context. Textbooks presented 
them before any explanation was given 
of the theory and its applications, there- 
by complicating an understanding of 
things that were intuitively clear. 

It must be borne in mind that theo- 
retical physicists make extensive use 
in their research of all, or nearly all, 
the methods of classical mathematics.
They are often able to employ
the most modern ideas that have sprung 
from the depths of abstract mathema- 
tics or even to devise their own new 
mathematical methods. But the ulti- 
mate result of the investigations of a 
physicist must always be effective. It 
must be expressed as a number or for- 
mula relating to the quantities ob- 
served. The physicist feels it is unimpor- 
tant to justify and establish the logi- 
cal structure of the foundations: he has 
faith in mathematics and does not doubt 
that its methods contain no internal 
contradictions. Theoretical physicists 
are not satisfied with the existing text- 
books and popular scientific literature 
on mathematics written by mathemati- 
cians themselves (especially by those 
who are far removed from physics);
they feel it necessary to make known to 
the beginner their own concepts of how 
a scientist should use the mathematical 
apparatus and in what simple way it is 
possible to grasp, “to a first approxima- 
tion,” the methods he will have need 
of. It seems to me that their tremendous 
mathematical experience gives them 
this right. 

Of course, such a view of mathemat- 
ics is somewhat one-sided, but surely 
there is no harm in that. Those who 
go on deeper into mathematics will be 
able better to grasp the meaning of the 
foundations of mathematics since they 
will already have an intuitive under-
standing of the essence of the matter
and will have developed the techniques 
of problem solving. As I see it, many 
courses in mathematics are spoiled by 
the desire to start the teaching of a 
new theory by confronting the student 
with the foundations and the formal 
language. As a mathematician who has 
worked with physicists for many years, 
I take the liberty to express the opinion 
that it would be advisable not only for 
future physicists and engineers but also 
for future professional mathematicians 
(and their teachers) to learn to under- 
stand better the style of mathematical 
thinking of the natural scientist and to 
grasp the problems that confront him. 

This text is the result of a great effort 
to improve the book Higher Mathemat- 
ics for Beginners and Its Application to 
Physics by the well-known Soviet phys- 
icist Academician Ya. B. Zeldovich. 
The purpose of this book is to enable 
the future physicist (chemist, engineer, 
etc.) to use higher mathematics in his
or her work by mastering its methods
without going into a full logical sub- 
stantiation of them, allowing the stu- 
dent to view mathematics as a section 
of natural sciences and to solve as many 
concrete problems as possible. 

The previous book had a number of 
technical drawbacks. Much has been 
done in this book to eliminate the short- 
comings and improve the presentation. 
Considerable new material has been 
added, and the text has been carefully 
edited. All this work was done jointly 
by the two authors: the physicist Acad- 
emician Ya. B. Zeldovich and the 
mathematician I. M. Yaglom, Doctor 
of Physico-Mathematical Sciences. 

It seems to me that the book achieves 
its objective. I would confidently 
recommend it to the thinking young 
reader who wishes to master the methods 
of higher mathematics informally and 
to apply them to problems in physics 
and technology. 

Academician S. P. Novikov 



Note to the Reader 


There are probably not many people 
who have never heard of Pushkin or 
Leo Tolstoy. But very many grown-up 
people, one-half or even nine-tenths, 
would not be able to explain what a 
derivative or an integral is. Yet without 
these concepts it is impossible to de- 
scribe or investigate the variables or 
functions that characterize the depen- 
dence of quantities on one another. The 
laws of nature are formulated in the 
language of higher mathematics, the 
language of derivatives and integrals. 

It’s our view, though we are not im- 
partial, that higher mathematics is as 
beautiful as the poetry of Pushkin and 
as profound as the prose of Dostoevsky 
or Tolstoy. But is higher mathematics 
seen by the general public as a cultural 
asset? Has it really entered the public 
consciousness as such? Now that mo- 
dern science has greatly extended the 
boundaries of the visible world by dis- 
covering the secrets of the structure of 
the atom and the many mysteries of 
the starry sky, the concepts of deriva- 
tive and integral have become necessa- 
ry elements of general culture. And in 
everyday life, too, it is quite useful 
to understand the rate of change of a 
quantity (the derivative) and the over- 
all effect of the action of a factor (the 
integral). This expands one’s outlook 
and can be used in a great variety of 
situations. 


The traditional methods of teaching 
higher mathematics have complicated 
matters. It would seem simple to under- 
stand “Crime and Punishment” or 
“War and Peace” and to grasp the spher- 
icity of the earth or the atomic structure 
of matter, yet difficult to master 
differentiation or integration. Students 
of the first half of the 20th century were 
terrified by the theory of limits and the 
language of infinitesimals, which is so 
unlike ordinary arithmetic and school 
algebra. When higher mathematics was 
introduced in the traditional way in the 
first year of study at institutions of high- 
er learning, it proved to be the main 
subject that caused students to drop 
out. The only way to change the atti- 
tude of the student toward this subject is 
to change the way it is taught. Higher 
mathematics must be transformed from 
a dry and difficult subject into a set
of clear and natural concepts that open 
the way to the study of physics, chemi- 
stry, and engineering disciplines. 

The first problem confronting the au- 
thors of this book was to give the stu- 
dent an understandable introduction to 
higher mathematics unencumbered by 
an unwieldy apparatus or logical sub- 
tleties. We believe that anyone acquaint- 
ed with the basics of arithmetic and 
who had above-average marks in algeb- 
ra in secondary school will find it easy 
(and, we hope, interesting) to read this 



8 


Note to the Reader 


book. We seek a friendly student who
has no doubts but believes and wants to 
take up this book with a view to learn- 
ing something new, undeterred by the 
fact that at times the authors offer 
simple examples in place of “very scien-
tific” theory.

The book contains many concrete 
examples (perhaps even too many for 
some). There are calculations and prob- 
lems connected with natural phenom- 
ena. We hope that these examples will 
make it easier for the student in his
studies. If we had expected the reader to 
be given to arguing, we would have writ- 
ten it in a different way. Perhaps for 
such a reader the standard presentation 
with all the usual reservations would 
have been more suitable, but we have 
no wish to lose a hundred possible 
friends just because of a single one who 
likes to argue. 

This book is intended for beginners, 
that is, for high-school students in 
the upper grades, students of trade 
schools and vocational schools, and stu- 
dents in the first years of college. We 
also have in mind anyone who by him- 
self wishes to become better acquainted 
with higher mathematics, say, people 
who finished school some time ago. 

The place of mathematics in life and 
in science is determined by the fact 
that it permits one to translate an eve- 
ryday intuitive approach to reality based 
on purely qualitative and hence ap-
proximate descriptions into the lan-
guage of exact definitions and formulae
from which quantitative conclusions 
can be drawn. It is indeed true that the
scientific level of any discipline is de- 
termined by the extent to which it uses 
mathematics. But the real language of 
science is not at all elementary algebra 
or geometry. Higher mathematics 
plays a far greater role. It studies varia- 
bles and processes for which algebra or 
geometry do not suffice. It also inves- 
tigates other more complicated divisions 
of mathematics which are only touched 
on in this book. It is no accident that the 
development of differential and integral 
calculus by Newton was intimately
bound up with his development of the
foundations of theoretical (we can also 
say mathematical) physics. He saw 
the concepts of higher mathematics as 
a (literal!) translation of the basic con- 
cepts of mechanics into mathematical 
language. 1 

The first chapters of this book give 
an idea both of the essence of higher 
mathematics and of its possible appli- 
cations. So we can imagine a reader who 
wishes to restrict himself to a selective 
study of Chapters 1 to 7. But it was not
for him that we wrote this book. To 
grasp the basics of higher mathematics 
it is necessary to have a good under- 
standing of how this apparatus is used. 
Knowledge not used is not real knowl- 
edge. It is grasped with difficulty and 
easily forgotten. This book is intended 
for people especially interested in phys- 
ics and engineering. For this reason 
the applications are largely taken from 
these areas. It would be possible today 
to compile a textbook of higher mathe- 
matics for a biologist (or future biolo- 
gist), geologist, or economist, but that 
would be an entirely different book and 
other people must write it. 

The second part of this book deals 
with a number of topics from physics 
and engineering. In these areas the pow- 
er of higher mathematics shows itself 
to the full. A great many highly impor- 
tant physical phenomena can be de- 
scribed with a degree of completeness that 
is unattainable without the use of a de-
rivative and integral. By way of illu-
stration we can mention the theory of
radioactivity (Chapter 8). Another ex-
ample is the phenomenon of resonance,
which in this book appears in two differ-
ent forms, as one of the effects appear-
ing in the theory of mechanical oscil-
lations (Chapter 10), and as one of the
properties of electric circuits (Chapter
13). Although some of the sections of
physics are elaborated in Part 2 in great
detail and others are only outlined or
even omitted altogether, the amount of
information about physics presented
here is so great that as a whole the book
can be viewed not only as a text in
mathematics but also as an important
addition to the ordinary physics
textbook.

1 It is also no accident that the other in-
ventor of higher mathematics, Leibniz, worked
out evolutionary conceptions that were
closely linked to his mathematical ideas and
were considerably ahead of their time. These
conceptions touched on biology, geology, and
physics; for instance, Leibniz conceived of
the concept of energy, a quantity that trans-
forms from one form to another but never vani-
shes in physical phenomena.

Chapters 8 to 13 (Part 2 of the book) 
are devoted to physical problems and 
are almost independent of each other. 
They can be read in any order. The only 
exception is Chapter 10, which is de- 
voted to oscillations. To understand 
this chapter it is necessary to read Sec- 
tions 9.1 to 9.6 of the chapter devoted 
to mechanics. Although Part 1 of the 
book, called “Elements of Higher Math- 
ematics,” contains many examples 
from physics and mechanics, various 
sections of Part 2, called “Higher Math 
Applied to Problems in Physics and 
Engineering,” will be of interest to all 
categories of readers. 

The large size of this book would 
seem to contradict the word “beginners” 
in the title. But don’t be afraid of the 
number of pages. We are well aware of 
all the shortcomings (and also of the 
merits) of the detailed and frankly rath- 
er wordy system of presentation that 
we have chosen. The book contains a 
great many examples and calculations. 
Some problems are elaborated several 
times and from different angles. Much 
space is devoted to the history of the 
theories presented. All the information 
contained in this book could, of course, 
be compressed and presented on far 
fewer pages. But then it would require 
a greater effort to read the book, and 
anything omitted or incorrectly under- 
stood would hamper further progress 
(or even make it impossible to proceed). 
Of course, there are situations in life 
when the best source of information is 
a brief telegram, but there is also charm 
in the art of unhurried presentation 
with many seemingly unnecessary de-
tails, which was so popular in the 18th
and 19th centuries and which has al- 
most been forgotten. We think the tele- 
graph style is not at all suitable for 
the beginner. The extra information, 
which in some cases can be ignored, is 
not burdensome, whereas insufficient in- 
formation or its extreme concentration 
in a book for the inexperienced reader 
is quite impermissible. 

It is also necessary to bear in mind 
that this book contains considerable 
material for persons interested in an 
in-depth study of the basics of higher 
mathematics. During the first reading
much of this material can be ignored,
say, all the small print, including the
story about the origin of higher mathe-
matics, and the starred sections. The
same goes for the entire third part of 
the book (“Some Additional Topics”) 
and the Conclusion (“What Next?”). 
And, finally, we have already empha- 
sized the possibility of a selective ac- 
quaintance with the second (physics) part 
of the book (Chapters 8 to 13). 

The large first chapter (“Functions 
and Graphs”) mainly contains material 
familiar from school. A quick run- 
through should suffice for most readers. 
Chapters 2 and 3 present the fundamen-
tal ideas of higher mathematics (diffe-
rential and integral calculus), while
Chapters 4 and 5 elaborate the basic 
techniques of higher mathematics suffi- 
cient for elementary applications. Chap- 
ter 6 is devoted to certain more ad- 
vanced areas of mathematical analysis 
(series and differential equations), and 
Chapter 7 is concerned with purely 
mathematical applications of the con- 
cepts of a derivative and an integral 
(physical applications are considered 
in Part 2 of the book). Elements of the 
history of higher mathematics are found 
in the concluding sections of Chapters 
3 and 6. These are separated from the 
main text by three asterisks. Finally,
the Conclusion (“What Next?”)
discusses some of the newer divi-
sions of mathematical science now
widely used in applications. 

The third part of the book (notably
the Conclusion) is of a slightly differ- 
ent character than the first and sec- 
ond. This last part is devoted to two 
trends in higher mathematics that con- 
siderably expand (and generalize) the 
classical differential and integral cal- 
culus of Newton and Leibniz. The first 
is complex numbers and functions of a
complex variable, which have long
played a major role in physics and engi- 
neering. It is indeed remarkable that 
many facts connected with ordinary 
real functions of real arguments become 
clearer when we move into the com- 
plex domain. The reader will find some 
examples of this in Chapters 14, 15, 
and 17. The second trend involves 
generalized functions (or distributions) 
f(x), and first of all the Dirac delta func-
tion, which is always equal to zero
except when x equals zero, in which
case it is equal to infinity (!). Such func- 
tions also play a big role in physics 
and engineering. In fact, their origin 
is much more closely connected with 
physics than with mathematics. We 
consider it most desirable that future
physicists and engineers become ac- 
quainted with these remarkable func- 
tions as early as possible. 

Chapters 14 to 17, which are devoted
to functions of a complex variable and 
to generalized functions, are intended 
for the especially interested reader. 
They are written in more of an outline 
form than the rest of the book (al-
though they can be understood by a suf-
ficiently persevering beginner). They 
can be read in greater or lesser detail,
as one wishes. 

The Conclusion (“What Next?”) is 
an even more general survey. We do 
not expect the reader to grasp the ma- 
terial fully. It is only intended to arouse 
the reader’s curiosity and encourage 
further work. It would be naive to 
think that the topics embraced by this 
book can complete the mathematical
education of a physicist or engineer. 
In the Conclusion we touch on certain 
other trends in science that found wide 
application in physics and technology
only in the second half of this century. 2
Although the Conclusion and, in fact, 
the entire third part of the book can 
easily be omitted (it is not needed for 
an understanding of the main part of 
the book), we believe that most readers 
will wish, at least cursorily, to familia- 
rize themselves with these, since they 
reveal prospects for the future and ex- 
plain certain ideas that are important 
to and characteristic of modern science. 

In short, this book can be used in a 
variety of ways. The reader interested 
largely in mathematics will naturally 
devote most of his attention to Part 1. 
He will undoubtedly find it useful, how- 
ever, to familiarize himself with var- 
ious sections of Part 2, say, Chapter 
8 or the first portions of Chapter 9 and 
some of Chapter 10, or with Chapter 
13. Anyone who wishes to deepen his 
knowledge in mathematics is advised 
to turn to the concluding sections of 
Chapter 10 and/or the third part of the 
book. The reader interested in the func- 
tions of a complex variable and the del- 
ta functions should not ignore Chapter 
17, which is devoted to the applications 
of the corresponding theories. Addition- 
al information about mathematics can 
be derived from the Conclusion, but 
for an in-depth grasp of the topics 
touched on in the Conclusion it is necessary
to turn to other literature. Finally, a 
reader interested in physics more than 
in mathematics need only run through 
the first part of the book, ignoring the 
text in small print and omitting most of 
the starred sections (but not Section 
6.7, which shows how differential equa- 
tions can be used to analyze phenomena 
in the natural sciences). Such a reader 
need not dwell long on Chapter 7, 
which is purely mathematical, but he 
should give careful study to Chapter 6 
because its applied significance is great. 


2 Here we draw particular attention to 
the theory of groups, which mathematicians 
developed in the 19th century and which 
at that time was considered far removed 
from any physics. At present the physics of 
elementary particles owes its achievements 
largely to the theory of groups. 

The reader interested in physics will
probably make a careful study of Part 
2 (and perhaps also Part 3, which is 
likewise connected with profound physi- 
cal theories). He can study Chapters 8 
to 13 in any order, though in some re- 
spects it is best not to allow for too big 
a break in time between Chapter 10 
and Chapter 13. 

The list of literature given at the 
end of the book includes certain other 
publications that outline the basics of 
higher mathematics, also textbooks that 
cover more material, and books that 
deal with topics that go far beyond the 
material presented here. As a natural 
continuation of this book we recom-
mend [15]. 3 On the other hand, the
books [19], [21] are far removed from 
the main content of this book: they are 
devoted to branches in mathematical 
science that relate to trends mentioned 
in the Conclusion. 

In the literature on physics we espe-
cially recommend an excellent three-vol-
ume course [34] (there are also prob-
lems and exercises relating to the whole 
course). In books [22-30, 37] higher 
mathematics is not used. 

In Appendices 1 to 4 at the end of 
the book the reader will find short ta- 
bles of derivatives, integrals, and numer- 
ical series (which are discussed in Part 
1) and also extracts from numerical 
tables, which, in a sense, round out the
book. They will save the reader the
trouble of turning to other textbooks
when doing the exercises that complete
each section of the book. The physics
examples are based on the now accepted
international system of units (SI). The
reader insufficiently acquainted with
this system will find Appendix 5 help-
ful. Finally, the concluding Appendix
6 presents the Greek alphabet, which is
widely used in this book.

3 Students of technical colleges who have
studied our book in the first year are advised,
in the second year, to turn to [15].

Like any book on mathematics, this 
one presupposes an active reader (one 
with pencil and paper) willing to work 
the many exercises that are given here. 
Better still if you have a pocket cal- 
culator or a programmable calculator. 
Then you will be able to verify and sub- 
stantiate the general theorems with cal- 
culations and make freer use of the 
method of “arithmetical experimenta- 
tion,” that is, approximate computa- 
tions that reveal the content of the con- 
cepts introduced and the meaning of 
the formulas. The many specific exam- 
ples will help you to evaluate the 
exactitude of the approximate formu- 
las scattered throughout the book, and 
you will better appreciate the wisdom 
of our forerunners who created methods 
and concepts without modern compu- 
ter techniques. At the same time you 
will feel the new power which we chil- 
dren of the 20th century derive from 
computers. 

Ya. B. Zeldovich 
I. M. Yaglom



Teaching Notes


This book is intended not only for stu-
dents but also for teachers who, while
familiar with the subject matter, might 
be interested in the methods that we 
recommend for teaching higher mathe- 
matics to future physicists and techni- 
cians. Our textbook is not designed for 
any particular category of students: 
it can be used in teaching mathematics 
or physics in high school, technic- 
al school, vocational school, or col- 
lege. High-school teachers will find 
it helpful in refreshing their ideas, 
obtained at college, about the relation- 
ships and interconnections between 
mathematics and physics. They can use 
it in elective classes or in physics and 
mathematics study circles; some parts 
of it can be conveyed to students di- 
rectly in the classroom. The college 
or university instructor can use the ma- 
terial in practical classes in the calcu- 
lus course (higher mathematics) or in 
the general physics course. Our book 
will also be useful to instructors during 
work on exercises in differential and 
integral calculus with first-year stu- 
dents of technical colleges or in the 
physics department, even if a different 
trend predominates in the lecture course 
in higher mathematics. Work on top- 
ics in Part 2 of the book (possibly Part 
3 as well) will help students to get a bal- 
anced idea of mathematics and facili- 
tate assimilation of certain parts of 


physics and technical subjects (the the- 
ory of oscillations, for instance). Most 
important of all, of course, is the fact 
that this book will help them not only 
to grasp the substance of differential 
and integral calculus but also to realize 
how and why calculus arose and why 
this key subject of mathematics is 
needed. 

The Selected Readings at the back of 
the book are intended for both students 
and teachers. Some of the books listed 
[4-6, 12-14] demonstrate possible ways 
to achieve an integrated approach to 
the teaching of higher mathematics and 
its applications. 

This book consists of three parts, the
first of which (Chapters 1 to 7) chiefly
reminds readers of facts learned at 
school. Here we set forth the ideas and 
methods of higher mathematics. Part 2 
(Chapters 8 to 13) deals with applica- 
tions of higher mathematics in physics 
and engineering; it is a fundamental 
part of the book, unlike Part 3 (Chapters 
14 to 17), which is supplementary and 
can, in principle, be skipped. The in- 
separable connection between the first 
part of the book, devoted to mathema- 
tics, and the second part, dealing with 
physics, is underlined by the fact that, 
for instance, the section on Fourier se- 
ries is included in Part 2 (in the chapter 
on oscillations), while the process of 
water flow is studied in Chapter 6, de-
voted to series and differential equa-
tions.

In writing this text we did not spe- 
cifically take into account the possi- 
ble use by present-day students of per- 
sonal computers. Naturally, these 
would further simplify the teacher’s ex- 
planations. For one thing, they make 
“numerical experiments,” such as the 
one in Section 4.7, much easier and fas- 
ter (and also, therefore, more natural). 
They enable students to employ numer- 
ical (approximate) differentiation and 
integration on a much broader scale. 
Computers should, naturally, also be 
taken into account when teaching stu- 
dents the applications of mathematics. 

The present volume is linked up with 
a book by one of the authors (Zeldo- 
vich). It came out in English in 1973 
(Higher Mathematics for Beginners and
Its Application to Physics, Mir Publi-
shers, Moscow). 1 One of the reasons for
writing it was that in those years dif-
ferential and integral calculus was not 
even mentioned in the Soviet Union’s 
high schools: the analytical section 
of the school mathematics course was 
limited to the fundamentals of algebra 
and of trigonometry, then still a sep- 
arate subject. As for the college course, 
this taught the fundamentals of cal- 
culus according to a system far re- 
moved from the needs of physicists and 
technicians. We remind our reader that 
the main college textbook in higher 
mathematics in the Soviet Union 
throughout the 1930s was W. Granville’s
well-known book, which had been trans- 
lated into Russian and then later re- 
vised by Academician N. Luzin. Revised 
editions of this simple book went 
through a large number of printings and 
were highly popular among engineering 
students. However, the harsh criticism 
leveled at the textbook by “pure” math- 
ematicians, who accused it of being 
slipshod, led to its complete replace- 
ment by textbooks of an altogether dif- 
ferent trend. In the 1950s it became
traditional to open the calculus course
with the theory of limits based on a re-
fined delta-epsilon technique using pro-
found concepts whose value was at
first only stated but not explained. All
this created a situation in which the
concepts of the derivative and integral,
intuitively fairly clear and simple,
seemed, to the uninitiated, to be some-
thing very profound and aroused a kind
of mystical awe (in the case of first-year
students there was also the real fear of
not passing the exam). Indeed, many
mathematicians, it is to be regretted, pro-
moted that feeling by stressing the logi-
cal subtlety of the concept of limits and
the need to use as a basis one of the va-
riations of the theory of real numbers
(though it is common knowledge that
substantiations of this kind, of neces-
sity fairly complicated, appeared only
in the second half of the 19th century,
that is, two centuries after the birth
of higher mathematics). The students’
logical criteria were developed on the
basis of theorems that seemed clear to
them without any proof. For instance,
the fact that a variable can have only
one limit 2 was traditionally established
without any regard for the intuitive
idea of a limit, which clearly tells us
that if a variable moves back and forth
between two “limits”, it does not tend
to either of them.

1 The first Russian edition came out in
1960, while the fifth and last, from which the
English translation was done, came out in 1970.

The situation today is indisputably 
better. A course in the rudiments of 
differential and integral calculus is 
obligatory in high schools in the
Soviet Union. And among the new col- 
lege and university textbooks there are 
some based on an approach similar to 
ours. For example, the author of the 
textbook [9], which has gone through
a number of editions, does not attempt
to prove a single theorem on limits, pre-
ferring, instead, to replace strict logic
with a frank appeal to common sense.
And even the textbooks which tradition-
ally open with an exposition of the
theory of limits are read in an entirely
different way by students who learned
the fundamentals of the derivative and
the integral while still in high
school.

2 Compare this with a theorem from a pre-
sent-day American school textbook on
geometry: E.E. Moise and F.L. Downes, Jr.,
Geometry, 2nd ed., Addison-Wesley, Menlo
Park, Cal., 1971: p. 51, Theorem 2-3: “every
segment has exactly one midpoint.” This is
obviously intended only for future mathema-
ticians.

Nevertheless, we feel it necessary to 
detail the motivation behind our meth- 
od, especially since this method un- 
doubtedly does not meet with general ap- 
proval. Although elements of higher 
mathematics have won a firm place in 
high-school mathematics, the ques- 
tion of how these elements should be 
presented is far from settled. An “en- 
gineering” approach or a “natural sci- 
ence” approach is quite possible here, 
but some take a fundamentally differ- 
ent view. Nor can it be said that the 
principles underlying the mathematics 
course in technical and vocational 
schools are clear and definite, though we 
believe that a more rigorous treatment 
than the one applied in this book is 
hardly needed for these schools. 

Clearly, before starting to plan a sys- 
tem of teaching mathematics we must 
give thought to the reasons why we are 
teaching it. The answer to “how” can 
only be arrived at after we answer the 
question “why.” There can be the follow- 
ing reasons for studying mathematics: 

(1) to pass an exam; 

(2) to develop one’s capacity for abs- 
tract and logical thinking; 

(3) to learn to use computers; 

(4) to apply it in studying engineer- 
ing, physics, biology, the microcosm 
of atoms and elementary particles, or 
the macrocosm of stars and galaxies. 

These aims differ greatly, and they
necessitate completely different instruc- 
tional methods and systems. Indisput- 
ably practical though it is, the first 
of the above-listed aims can of course 
be dismissed at once. It is clear that 
the examiner must meet the needs of 
the students rather than the other way 
around (although, unfortunately, this 
is not always the case). The second aim 
cannot and should not be wholly ignored. 
Since it is connected with the 
training of future mathematicians, the 
growing importance of mathematics 
compels us to pay due attention to this 
point. But it would be absurd to main- 
tain that teaching shaped by aim (2) 
should be greatly expanded; after all, 
mathematicians will never comprise a 
significant part of the population. From 
this point of view the trend in schools in 
many countries to teach the new math, 
understood as the fundamentals of sym- 
bolic logic and of the general theory 
of sets and the doctrine of abstractly 
introduced functions and so-called bi- 
nary relations, strikes us as harmful. 
Nor can aim (3) be the main guideline 
in planning large-scale mathematics 
instruction; the training of future prog- 
rammers and other computer experts 
is important, yet for the most part it is 
only a special case. 

Our book is meant for future engi- 
neers and natural scientists who want 
to know mathematics for aim (4). In 
our opinion, this undoubtedly large 
category of students should, as quickly 
as possible, be taught the information 
they need, without cluttering it up 
with superfluous logical subtleties or 
striving for maximum scope or a com- 
pletely rigorous approach. For them, 
after all, mathematics will always be a 
tool, a language with which to describe 
phenomena, and not something in which 
they are interested for itself. Moreover, 
we feel that, as was the case in the 19th 
century and will probably be the case in 
the 21st century too, the main stress 
should be put on the elements of mathematical 
analysis, differential and integral 
calculus, and the simplest differential 
equations. 

The approach to the main concepts 
of analysis should proceed from the idea 
that Nature is arranged fairly simply, 
that “God may be subtle, but He is not 
malicious,” as Einstein put it. Hyper- 
trophy of the theorems of existence can 
only discourage students who are en- 
dowed with a healthy feeling for phys- 
ics. It is clear that a moving particle 
has a velocity, and that a curved line 





has a tangent; to demonstrate the exis- 
tence of velocity is just as unnecessary 
as it is to demonstrate that a flat figure 
has an area. Just take some paper or 
cardboard, cut out a figure and weigh 
it, divide its mass by the mass per square 
meter of the material, and you have 
the area. In other words, in the initial 
stage of the study of analysis the exis- 
tence theorems of the derivative and 
integral are not needed (in fact, they 
are harmful): both of these concepts 
have a natural physical meaning, and 
hence they exist. 3 It would be absurd to 
start learning grammar by studying the 
exceptions instead of the general rules, 
or to start a language course by study- 
ing the grammar instead of first build- 
ing up a vocabulary. Yet that is the 
same, unfortunately, as the fairly wide- 
spread system of learning the “mathe- 
matical language” in which the concept 
(and properties) of the limit precedes 
any substantive application of this 
concept. 

A fundamental law of biology is that 
“ontogeny recapitulates phylogeny,” 
that is, the development of an individual 
repeats, in one degree or another, 
the evolution of its species. For 
instance, at a certain period of its 
growth the human embryo has gill 
slits, because life originated in the sea 
and the ancestors of mammals, includ- 
ing man, were fish. We feel that teach- 
ers, too, should remember this law of 
biology. The future physicist or engi- 
neer should first study mathematics in 
the form in which it was created by the 
great scientists of the 17th and 18th 
centuries instead of receiving it immedi- 
ately in the form that had been worked 
out by the second half of the 20th 
century after a long period of evolution. 

This, it goes without saying, should 
not be understood as a call to reproduce 

3 When students of the famous William 
Thomson, Lord Kelvin (1824-1907), tried to 
determine the derivative by the Cauchy procedure, 
he grew irritated. “Oh, forget Todhunter,” 
he said. “The derivative is just the 
velocity.” (Todhunter was a professor of 
“pure” mathematics who in Kelvin’s time 
taught a course in calculus at Cambridge.) 


exactly, in present-day textbooks, the 
fairly archaic reasoning of Newton, 
with its long-forgotten terms and symbols, 
or of Leibniz, with his characteristic 
“metaphysics of infinitesimals.” 
However, the ideas of a course for be- 
ginners could, it seems to us, be based 
on familiar classical graphic ideas in- 
stead of on the modern theories of real 
numbers or on strict definitions devel- 
oped over the centuries. Of course, 
there comes a time when the student 
has to be told about possible complications, 
though not about exotic “continuous 
functions that do not have any derivatives 
anywhere,” 4 but simply about 
the possibilities of a gap or a break in 
the curve. However, we believe such 
warnings should be given not before 
but after the student has intuitively 
assimilated the basic ideas, just as it 
would be absurd to begin the study of a 
new mechanism by learning the instruc- 
tions about a possible breakdown. 
Furthermore, we are inclined to believe 
that a far more instructive (and informative) 
approach than the traditional 
nihilistic one is the modern “constructive” 
approach to peculiarities of the 

that every continuous function (even 
if it’s not smooth) has a derivative 
which, however, may prove to be a delta 
function instead of an ordinary function; 
we devote much attention to 
this approach in Chapter 16 of the book. 
We also take advantage of every oppor- 
tunity to stress such important general 
ideas as linearity and the principle of 
superposition , which are sometimes im- 
permissibly left in the shade in books 

4 Here we are inclined to concur with the 
French mathematician Charles Hermite, one 
of the most prominent analysts of the second 
half of the 19th century, who said, in a letter 
to his friend T.J. Stieltjes, the Dutch analyst, 
that “I turn away with fright and horror from 
this lamentable evil of functions which do 
not have derivatives.” A book by the French 
mathematician B. Mandelbrot (Fractals, 
W.H. Freeman, San Francisco, 1977), which 
takes a diametrically opposite stand, is highly 
interesting but clearly intended for far more 
sophisticated readers than those for whom 
our elementary textbook is designed. 
intended for physicists and engineers 
or students training for these fields. 

Mathematics is developing by leaps 
and bounds, and so is the body of know- 
ledge needed by those who use it. 
Physicists and engineers have to be 
taught more today than ever before. 
Some of the new scientific disciplines 
are named in the Conclusion entitled 
“What Next?”. Computational mathe- 
matics, based on the enormous poten- 
tials of the electronic computer, is hard- 
ly touched on in this book, but it too 
is acquiring tremendous importance. 
We think it quite possible to acquaint 
beginners more quickly with the initi- 
al sections of higher mathematics so 
that they can apply their knowledge 
immediately and have more time to 
learn new substantive mathematical 
theories. 

The pedagogical approach taken by 
the authors of many familiar calculus 
textbooks is reminiscent of the debates 
conducted by medieval Scholastics. 
Such authors regard the student as an 
experienced opponent eager to find weak 
points in the views of the teacher and 
they believe that the work of the teach- 
er is to refute all possible objections. 
We, on the contrary, regard the student 
as a friend and ally who is ready to be- 
lieve the teacher or the textbook and 
whose primary concern is to be able as 
quickly as possible to employ, in the 
study of nature and technology, the 
new methods learned. It is chiefly through 
an analysis of many examples and 
applications that the student comes to 
understand the substance of the new 
tools and gains the intuition that fore- 
warns when the tools refuse to function. 
The strictly logical approach bypasses 
the question of the origin, significance, 
and benefit of the concepts and 
theorems under study. This book focuses 
mainly on “rough” mathematical ideas 
and their connections with natural phe- 
nomena. 

We find support for our positions in 
the writings of many prominent scien- 
tists. The famous Russian naval archi- 
tect Aleksei Krylov (1863-1945), Mem- 


ber of the Academy of Sciences, out- 
standing scientist, technician and engi- 
neer whose work and achievements cov- 
ered an exceptionally broad range, 
author of the first Russian book on numerical 
methods in mathematics, translator 
of Newton's Mathematical Principles 
of Natural Philosophy, and an authority 
on celestial mechanics, wrote 
the following: “That on which Newton 
based the whole of the modern theory 
of the Universe and his incontrovert- 
ible proofs of the structure of the system 
of the world cannot be regarded as in- 
sufficiently rigorous for a 16-year-old 
high-school student.” Krylov point- 
ed out that beginners often accept 
the refined rigor of mathematical proofs 
as “a triumph of science over common 
sense.” In his well-known Autobio- 
graphical Notes Albert Einstein, one of 
the greatest physicists of all times, 
wrote: 

At the age of 12-16 I familiarized 
myself with the elements of mathemat- 
ics together with the principles of differ- 
ential and integral calculus. In doing so 
I had the good fortune of hitting upon 
books which were not too particular in 
their logical rigor, but which made up 
for this by permitting the main thoughts 
to stand out clearly and synoptically. 
This occupation was, on the whole, truly 
fascinating; climaxes were reached whose 
impression could easily compete with 
that of elementary geometry— the ba- 
sic idea of analytical geometry, the in- 
finite series, the concepts of the differen- 
tial and the integral.... 

A similar view was held by the distin- 
guished Soviet theoretical physicist 
Lev Landau (1908-1968), Lenin Prize 
and Nobel Prize winner, Member of 
the USSR Academy of Sciences, a man 
who combined outstanding intuition in 
physics with a high level of mathemat- 
ical technique. When asked for his 
opinion about a mathematics syllabus 
for a Moscow physics institute, he 
wrote: 

Unfortunately, your syllabus suffers 
from the same shortcomings that are 
common to the mathematics syllabuses 
that turn half of the study of mathematics 
by physicists into a tiring waste 
of time. Important as mathematics is to 
physicists, what they need, as we know, 
is a calculating analytical mathematics; 
however, for reasons I cannot under- 
stand, mathematicians force logical exer- 
cises on us as an obligatory part of the 
course.... I think it is high time to teach 
physicists what they themselves think 
they need instead of trying to save their 
souls against their own wishes. I have no 
desire to discuss the idea worthy of medie- 
val scholasticism that people learn to 
think logically by studying things they 
do not need. I most emphatically main- 
tain that all theorems of existence, 
highly rigorous proofs, etc., should be 
completely expelled from the mathe- 
matics studied by physicists. 

These differences of opinion between 
mathematicians and physicists led to 
a paradoxical situation in which three (!) 
Nobel Prize winners in physics and 
chemistry (Walther Nernst, Svante 
Arrhenius, and Hendrik Antoon Lorentz) 
found it necessary to write elementary 
textbooks in higher mathematics “for 
natural scientists and physicians,” as 
the title of the book by Arrhenius says, 
or “for natural scientists, with special 
attention paid to the needs of chemists,” 
as Nernst pointed out to his readers. 
The best known of these books is the 
Course in Differential and Integral Cal- 
culus, With Applications in Natural 
Science by H. A. Lorentz, which went 
through several editions between 1901 
and 1926 and was still highly popular 
in the 1930s as a textbook for future en- 
gineers and also for self-instruction. 
The ideas in the present book are fairly 
similar to those set forth in all the 
above-named books. 

It is worth noting that many “pure” 
mathematicians who do not special- 
ize in abstract algebra or mathematical 
logic and have a taste for applications 
think along the same lines as the phy- 
sicists mentioned in the above para- 
graph. For instance, the famous Ger- 
man (and later American) mathemati- 
cian Richard Courant (1888-1972), found- 
er of New York University’s Courant 
Institute of Mathematical Sciences, 
wrote in 1964 that mathematicians had 
for a very long time accepted Eucli- 


dean geometry as a model of a strictly 
logical approach, of strict logical dedu- 
ction. But here is what he says further 
(Scientific American , September 1964, 
p. 43) : 5 

But emphasis on this [axiomatic, log- 
ical] aspect of mathematics is totally 
misleading if it suggests that construc- 
tion, imaginative induction and combi- 
nation and the elusive mental process 
called intuition play a secondary role in 
productive mathematical activity or ge- 
nuine understanding. In mathematical 
education, it is true, the deductive meth- 
od starting from seemingly dogmatic 
axioms provides a shortcut for covering 
a large territory. But the constructive 
Socratic method that proceeds from the 
particular to the general and eschews 
dogmatic compulsion leads the way more 
surely to independent productive think- 
ing. 

All the above provides a sufficiently 
full explanation of the basic guidelines 
followed in our textbook. Its principles 
and underlying propositions coincide 
with those typical of the textbook 
Higher Mathematics for Beginners and 
Its Application to Physics; however, 
the details of the exposition differ sub- 
stantially from the old one, even in the 
sections that were part of the above- 
named book. There are many new top- 
ics here; this refers both to a number of 
particular points (the second differen- 
tial of a function of one variable or the 
total differential of a function of two 
variables; the idea of inertial refer- 
ence frames) and to entire large areas of 
science (Fourier series, the theory of 
functions of a complex variable, or the 
design of a laser). 

When using this textbook, the teach- 
er can freely draw on material from 
the physics sections to illustrate mathe- 
matical concepts and theorems or, con- 
versely, can include separate mathemat- 
ics topics in physics lessons. For ex- 
ample, in the text we illustrate the con- 
cept of the derivative by the connec- 
tion between the path traversed and 

5 Another excellent book, G. Polya’s 
Mathematics and Plausible Reasoning , 2 vols, 
(Princeton Univ. Press, Princeton, N.J., 
1954), follows the same approach. 
the speed, or between the quantity of 
heat needed to heat a body and the 
body’s heat capacity; but nothing pre- 
vents the teacher from referring here 
to, say, the connection between the elec- 
tromotive force and the current inten- 
sity for a coil (or for a capacitor, where 
the relationship is the reverse; cf. Sec- 
tion 13.1). It goes without saying that 
the number of such examples can be in- 
creased indefinitely. We believe that 
integrated teaching of calculus and its 
applications is very important for fu- 
ture technicians and physicists because 
a mathematical technique acquires deep 
meaning only in its application. The 
physics sections of the book are doubly 
beneficial: they not only give the stu- 
dent important information in physics 
but shed additional light on the mathe- 
matical constructions in the initial 
chapters and help the student to gain a 
new understanding of how natural and 
necessary these constructions are. Dur- 


ing instruction, part of this material 
can be omitted of course; however, it 
seems to us that some use of physical 
considerations in the exercises in mathe- 
matics (and partly in the lectures as 
well) is very useful, and that a well- 
thought-out coordination of the elemen- 
tary courses in mathematics and physics 
is absolutely essential. 

In conclusion we would like to call 
the attention of the teacher (and stu- 
dent) to textbook [11], which is close 
to our ideas in spirit but is more mathe- 
matical in content. The modern “com- 
puterization era” makes it necessary 
to mention another excellent (but more 
complicated) book [20], which is 
aimed at acquainting the reader with the 
applications of computers in mathe- 
matics and mathematical physics. 

Ya. B. Zeldovich 
I. M. Yaglom 
Elements of Higher Mathematics 


Chapter 1 Functions and Graphs 


This chapter is of an introductory na- 
ture. It contains material that the majo- 
rity of the readers already know and is 
devoted to functional relationships be- 
tween quantities. In it the reader will 
find some specific functional relation- 
ships that we will often use in our nar- 
rative. With other important functional 
relationships the reader will get acquain- 
ted in the other chapters. 

1.1 The Functional Relationship 

Nature and technology abound in 
functional relationships , that is, rela- 
tionships between various quantities. 
It is natural then that mathematics 
pays so much attention to such rela- 
tionships. A functional relationship be- 
tween one quantity ( y ) and another 
(x) signifies that to every value of x 
there corresponds a definite value of y. 
The quantity x in this case is called the 
independent variable , and y is the func- 
tion of this variable. We also some- 
times say that x is the argument of the 
function y. 

Here are a few examples taken from 
geometry and physics. 

(1) The area S of a square is a function 
of the length a of the square’s 
side: $S = a^2$. 

(2) The volume V of a sphere is a 
function of the sphere’s radius R: 
$V = \frac{4}{3}\pi R^3$. 


(3) The volume V of a cone with a 
given altitude h is a function of the 
radius r of the cone’s base: 

$$V = \tfrac{1}{3}\pi r^2 h. \qquad (1.1.1)$$ 

On the other hand if we assume that r 
is fixed, then formula (1.1.1) expresses 
the volume V of the cone as the func- 
tion of its altitude h. 

(4) The distance z traversed by a free- 
ly falling body depends on the time t 
that elapses from the beginning of fall. 
This relationship is expressed by the 
formula 

$$z = \frac{g t^2}{2}, \qquad (1.1.2)$$ 

where $g \approx 9.8\ \mathrm{m/s^2}$ is the acceleration 
of gravity. 

(5) The current i depends on the re- 
sistance R of a conductor: for a given 
potential difference u we have 

$$i = \frac{u}{R}. \qquad (1.1.3)$$ 

This list could be extended without 
end. 
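
As a minimal sketch, the five examples above can be written as Python functions, each of which returns one definite value for every admissible value of its argument. The function names and the constant `G` are illustrative choices of ours, not notation from the text.

```python
import math

G = 9.8  # acceleration of gravity in m/s^2 (approximate value quoted in the text)


def square_area(a):
    """Example (1): area S of a square with side a."""
    return a ** 2


def sphere_volume(R):
    """Example (2): volume V of a sphere with radius R."""
    return 4.0 / 3.0 * math.pi * R ** 3


def cone_volume(r, h):
    """Example (3), formula (1.1.1): cone with base radius r and altitude h."""
    return math.pi * r ** 2 * h / 3.0


def fall_distance(t, g=G):
    """Example (4): distance z covered by a freely falling body in time t."""
    return g * t ** 2 / 2.0


def current(R, u):
    """Example (5): current i through resistance R at potential difference u."""
    return u / R


print(square_area(3.0))        # 9.0
print(fall_distance(2.0))      # 19.6
print(current(R=4.0, u=12.0))  # 3.0
```

Each call plays the role of "substituting a value of the independent variable into the formula."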

In mathematics, functional relation- 
ships are ordinarily defined by formu- 
las, for example, 

$$y = 2x + 3, \quad y = x^2 + 5,$$ 

$$y = 3x^3 - x^2 - x, \qquad (1.1.4)$$ 

$$y = \frac{x - 1}{x + 1}, \quad y = \sqrt{3x + 7}.$$ 
(Here y is everywhere a function of the 
argument x.) A formula enables us to 
compute the values of the function for 
each given value of the independent 
variable. Sometimes this method re- 
quires using tables or other technical de- 
vices. For instance, in the last example 
in (1.1.4) we must extract the square 
root of a number to find the value of 
the function. This can be done via ta- 
bles of square roots or, even simpler, 
with a pocket calculator. 

In physics and technology, function- 
al relationships are usually recorded 
by measuring devices; say, the pointer 
of the speedometer in your car shows, 
together with the hand of your watch, 
the functional relationship of the car’s 
speed and time. But in this case, too, 
the relationship can often be represent- 
ed by a simple formula, which yields 
sufficiently accurate results. For instan- 
ce, if the potential difference in an elec- 
tric circuit remains constant, the ex- 
perimentally observed (via the scale 
of the ammeter) relationship between 
the electric current i and the resistance 
R of the circuit is described by formu- 
la (1.1.3) fairly well. If we have a for- 
mula that relates the values of quanti- 
ties that appear in physics and technol- 
ogy, we say that we have established 
a law for the phenomenon in question. 
In relation to formula (1.1.3) the law is 
Ohm's law. 

It is characteristic that in most cases 
in nature and technology the quanti- 
ty of interest (the function) depends on 
several other quantities. In Example 
(5) the current depends on two quanti- 
ties: the potential difference u and the 
resistance R of the conductor. The 
cone’s volume in Example (3) depends 
on altitude h and base radius r. Assum- 
ing all quantities except one to be 
given and constant, we study the de- 
pendence of the function on a single 
variable. In this book we will confine 
ourselves mainly to functions of one 
variable. For example, taking a given 
storage battery with a definite potential 
difference u, we will vary the resistance 
R of the conductor and measure the 


current i. In this experiment the cur- 
rent depends only on the resistance, the 
quantity u in (1.1.3) being regarded as a 
constant coefficient. 
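
The idea of holding all quantities but one fixed can be sketched in Python with `functools.partial`. The 12 V potential difference and the name `current_at_12v` are hypothetical values chosen only for illustration.

```python
from functools import partial


def current(R, u):
    """Ohm's law: current i for resistance R and potential difference u."""
    return u / R


# Fix the battery's potential difference (12 V is an illustrative value),
# which leaves a function of the single variable R.
current_at_12v = partial(current, u=12.0)

for R in (2.0, 4.0, 6.0):
    print(R, current_at_12v(R))
```

Here `current` depends on two quantities, while `current_at_12v` is a function of the resistance alone, with u acting as a constant coefficient.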

Sometimes the formula gives only 
values of y for x varying within certain 
limits. Say, the formula $y = \sqrt{x}$ enables 
finding the values of y only for 
nonnegative values of x, that is, for 
$x \ge 0$. If $y = \log_2(x - 2)$, then y 
exists only for $x - 2 > 0$, that is, 
for $x > 2$. In the formula $y = (x - 1)/(x + 1)$ 
in (1.1.4) we must assume 
that $x \ne -1$, while in the 
last formula in (1.1.4) we must assume 
that $3x + 7 \ge 0$, that is, $x \ge -7/3 \approx -2.33$. 
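
A small sketch of how such restrictions might be made explicit when the last two formulas of (1.1.4) are programmed. The function names and the choice to raise `ValueError` outside the admissible range are our own assumptions.

```python
import math


def f_ratio(x):
    """y = (x - 1)/(x + 1); the formula has no value at x = -1."""
    if x == -1:
        raise ValueError("x = -1 is outside the domain")
    return (x - 1) / (x + 1)


def f_root(x):
    """y = sqrt(3x + 7); defined only when 3x + 7 >= 0, i.e. x >= -7/3."""
    if 3 * x + 7 < 0:
        raise ValueError("3x + 7 must be nonnegative")
    return math.sqrt(3 * x + 7)


print(f_ratio(3))  # 0.5
print(f_root(3))   # 4.0
```

Calling `f_root(-3)` or `f_ratio(-1)` raises an error instead of silently returning a meaningless number.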

In formulas that appear in physical, 
engineering, geometric, and other prob- 
lems, the restrictions on the possible 
values of the independent variable 
sometimes follow from the very meaning 
of the problem considered. In the exa- 
mples at the beginning of this section, 
say, when calculating the area of a 
square and the volume of a cone, we 
assumed that the side of the square and 
the base radius of the cone and its al- 
titude are positive quantities. Often 
the formulas “tell” us about the possible 
restrictions on the values of the inde- 
pendent variable. For instance, the 
formula for the cone’s volume yields a 
negative value if we take h negative, 
which result is of course unnatural; 
in many formulas of the theory of rela- 
tivity, the speed v of a moving body 
appears in expressions of the type 
$\sqrt{c^2 - v^2}$, where c is the speed of light, 
from which it follows that v < c; and 
so on. 

In the first three formulas of (1.1.4) 
and, obviously, in many other formu- 
las, no restrictions are imposed on the 
independent variable: x can be large or 
small, positive or negative. The same 
is true of many physical quantities; 
say, electric current can conveniently 
be assumed positive when the electrons 
move in a specified direction and nega- 
tive when they move in the opposite 
direction. 





Knowing the formula that states the 
dependence of y on x , we can easily 
construct a table of values of y for sev- 
eral arbitrarily chosen values of x. 

By way of an illustration, we will 
set up a table for the function $y = 3x^3 - x^2 - x$. 
The upper row contains 
the values of x that we choose, the 
lower row, under each value of x, con- 
tains the appropriate value of y: 


x                       -3    -2    -1     0     1     2     3 
y = 3x^3 - x^2 - x     -87   -26    -3     0     1    18    69 


Using this formula, we can make a 
more detailed table specifying, say, 
the values x = 0, 0.1, 0.2, .... Thus a 
formula is stronger, so to say, than 
any table. The formula not only con- 
tains the information necessary to com- 
pile the given table but also enables 
one to find the values of the function 
for values of the independent variable 
not contained in the table. On the other 
hand, a table is convenient in that it 
immediately gives the value of y for 
any given value of x, provided that the 
needed x is given in the table. A table 
is also more pictorial than a complicated 
formula, from which it is often 
difficult to estimate the values that the 
function assumes. However, a simple 
formula, with certain experience on 
the part of the researcher, provides 
a faster means for determining the be- 
havior of the function than the inex- 
pressive sequence of numbers in a table. 
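
The remark that a formula is "stronger" than any table can be illustrated with a short sketch that rebuilds the table for y = 3x^3 - x^2 - x and then produces a finer one. The variable names are ours.

```python
def y(x):
    """The function y = 3x^3 - x^2 - x tabulated in the text."""
    return 3 * x ** 3 - x ** 2 - x


# The table from the text: integer x from -3 to 3.
table = [(x, y(x)) for x in range(-3, 4)]
print("x:", [x for x, _ in table])
print("y:", [v for _, v in table])

# A more detailed table for x = 0, 0.1, 0.2, ..., 1.0 (values rounded for display).
fine = [(k / 10, round(y(k / 10), 3)) for k in range(11)]
for x, v in fine:
    print(x, v)
```

The same function produces both tables; the converse, recovering the formula from a handful of tabulated values, is not possible without further assumptions.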

In natural sciences and technology 
it often happens that the theory of the 
phenomenon of interest to us is lacking 
and the physicist (or chemist, biologist, 
engineer) is only able to supply exper- 
imentally obtained facts — the depen- 
dence of the quantity of interest upon 
the quantity that was given in the ex- 
periment. This is what happens, say, 
in studies of the relationship between 
the resistance of a conductor and the 
temperature of the conductor. Here the 
functional relationship can only be giv- 
en in the form of a table containing 
the experimental data. 


Experiments show that for a given 
conductor (of a given material, a given 
cross section, and a given length) the 
electrical resistance depends on the 
temperature of the conductor. For each 
value of the temperature T, the conductor 
has a definite resistance R, so that 
we can speak of a relationship in which 
resistance R is a function of temperature 
T. Carrying out experiments, we can 
find the values of R for various T 
and thus find the dependence (relation- 
ship or function) R versus T, or R = 
R ( T ). Here, the results of experi- 
ments are given in the table below (va- 
lues of R for distinct values of T ): 


T (degrees Celsius)      0      25      50      75     100 
R (ohms)             112.0   118.4   124.6   130.3   135.2 


If we are interested in the values of 
R for other temperatures which do not 
appear in the table, additional mea- 
surements are required because there is 
no exact formula defining the depen- 
dence of R on T. Practically speaking, 
however, we can always offer an approx- 
imate formula that is in good agree- 
ment with the experimental data at 
temperatures at which the measure- 
ments were made. Let us take, for ex- 
ample, the formula 

$$R = 112.0 + 0.272T - 0.0004T^2 \qquad (1.1.5)$$ 

and compile the following table using 
this formula: 


T (degrees Celsius)      0      25       50      75      100 
R (ohms)             112.0   118.55   124.6   130.15   135.2 


We see that the formula yields val- 
ues of R that are very close to the ex- 
perimental values for those tempera- 
tures at which the measurements were 
made, and so one is justified in assuming 
that at intermediate temperatures 
(say, at T = 10, 80, or 90 °C) the 
formula will likewise give a correct 
description of the functional dependence 
of R on T. Mathematicians say in 
this case that the established dependen- 
ce of R on T does not yield large errors 
in interpolation , which is a process in 
which we go from the known values of 
R to new, intermediate, values (the 
Latin word interpolare means to refur- 
bish, alter). The dependence of resis- 
tance R on temperature T found via 
formula (1.1.5) is called an empirical 
dependence , or an empirical formula . 
(The adjective “empirical,” from the 
Greek word empeiria meaning experi- 
ence, means relying on experience or 
observation, based on observation or 
experience, or capable of being verified 
or disproved by observation or 
experiment.) However, an empirical 
formula, of course, requires verifica- 
tion, since the error introduced by using 
it may be considerable (clearly, an 
empirical formula is the more reliable 
the denser the grid formed 
by those values of the variable from 
which the empirical formula was constructed). 
An empirical formula becomes 
quite unreliable when we use it 
outside the range of the independent 
variable for which it was constructed 
(such continuation of the range of the 
empirical formula is known as extra- 
polation , from the Latin extra meaning 
outside); the errors in this case may be 
very large. For instance, we cannot ap- 
ply formula (1.1.5) outside the range 
of the investigated interval, say, at 
T = -200 °C or at T = 500 °C, since 
there are no grounds to expect that for 
all T the resistance R(T) will be expressed 
by a quadratic trinomial in T, that is, by formula 
(1.1.5). 
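
The comparison between the measured values and the empirical formula (1.1.5), and the difference between interpolation and extrapolation, can be sketched as follows. The dictionary `measured` simply restates the experimental table; the names `R_empirical` and `max_dev` are ours.

```python
# Measured resistance (ohms) at each temperature (degrees Celsius), from the table.
measured = {0: 112.0, 25: 118.4, 50: 124.6, 75: 130.3, 100: 135.2}


def R_empirical(T):
    """Empirical formula (1.1.5)."""
    return 112.0 + 0.272 * T - 0.0004 * T ** 2


# Largest disagreement between the formula and the measurements.
max_dev = max(abs(R_empirical(T) - R) for T, R in measured.items())
print(f"max deviation at measured points: {max_dev:.2f} ohm")

# Interpolation: temperatures inside the measured range 0..100 deg C.
for T in (10, 80, 90):
    print(T, round(R_empirical(T), 2))

# Extrapolation: the formula still returns numbers,
# but nothing in the data supports them.
for T in (-200, 500):
    print(T, round(R_empirical(T), 2), "(outside the measured range)")
```

The code happily evaluates the formula at T = -200 or T = 500; deciding that those numbers should not be trusted is a judgment the formula itself cannot make.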

1.2 Coordinates. Distances and Angles 
Expressed in Terms of Coordinates 

Rectangular Cartesian coordinates 
are used as a pictorial way of represent- 
ing functional relationships by means 
of drawings (graphs). Draw two per- 
pendicular straight lines in a plane. 
Call the horizontal line the x axis (also 
known as the axis of abscissas), the vertical 
line the y axis (also known as the 
axis of ordinates), and the point of intersection 
of the two lines (point O in 
Figure 1.2.1) the origin of coordinates, 
or simply the origin. It is customary to 
picture the plane with the x and y 
axes not flat on a table but vertically in 
front of the reader, just as if it were 
drawn on the wall opposite you. The 
arrow of the x axis points from left to right 
and the arrow of the y axis points 
upward. These arrows show that, say, 
the segment OB of the x axis is taken 
to be positive if B lies to the right of 
point O (as in Figure 1.2.1) and negative 
if B lies to the left of point O; similarly, 
the segment OC of the y axis is 
taken positive if C lies above O (as in 
Figure 1.2.1) and negative if C lies 
below O. 

Figure 1.2.1 

A definite pair of values of x and y , 
say, x = 2 and y = 4, is represented 
in the coordinate plane by a single 
point. To construct this point we plot on 
the axes of abscissas and ordinates the 
segments OB = 2 and OC = 4 (Fig- 
ure 1.2.1), with both segments being pos- 
itive because they are plotted from left 
to right and upward, respectively, and 
the x and y are positive. If we erect 
perpendiculars at points B and C on 
the x and y axes, the point at which 
they intersect is the sought point 
A(x = 2, y = 4) or, briefly, A (2,4). 
Conversely, to find the coordinates of 
an arbitrary point A it is sufficient to 
drop perpendiculars AB and AC on 
the coordinate axes and “read” the val- 
ues of x and y corresponding to points 
B and C (these values are equal to the 
lengths of segments OB and OC taken 
with the appropriate signs). The coordinate 
axes divide the plane of the drawing 
into four parts or quadrants, 
I, II, III, and IV, with the signs in 
each quadrant being as follows: (+, +), 
or positive abscissas and positive 
ordinates, in quadrant I, and (-, +), 
(-, -), and (+, -) in quadrants 
II, III, and IV, respectively (Figure 
1.2.2). The axis of abscissas, Ox, corresponds 
to the value y = 0, while the 
axis of ordinates, Oy, corresponds to 
the value x = 0. The origin O has coordinates 
(0, 0). Figure 1.2.3 shows a few 
examples of points for which the corresponding 
values of the coordinates x 
and y are given in parentheses. 

Figure 1.2.2 
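
The sign rules for the four quadrants can be summarized in a short sketch. The function name `quadrant` and the labels returned for points lying on an axis are our own conventions.

```python
def quadrant(x, y):
    """Return 'I'..'IV' for a point off the axes, or a label if it lies on an axis."""
    if x == 0 and y == 0:
        return "origin"
    if y == 0:
        return "x axis"   # axis of abscissas: y = 0
    if x == 0:
        return "y axis"   # axis of ordinates: x = 0
    if x > 0:
        return "I" if y > 0 else "IV"
    return "II" if y > 0 else "III"


for point in [(2, 4), (-2, 2), (-3, -3), (2, -2), (0, 0)]:
    print(point, quadrant(*point))
```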

An important piece of practical ad- 
vice: get into the habit of evaluating 
approximately the values of the coordi- 
nates of points in the plane and of find- 
ing the points from the known values 
of the coordinates. For practical work it 
is often convenient to use squared pa- 
per (it usually has a grid of millimeter 
squares) to locate points and plot 
curves. For instance, we advise the reader 
to try and determine the coordinates of 
points A to N in Figure 1.2.4 without 
resorting to additional lines and designations. 

Figure 1.2.4 

Thus, we can conclude that fixing two 
numbers, values of x and y, determines 
the position of a point in the plane. 
Therefore, all geometric quantities re- 
ferring to this point can also be ex- 
pressed in terms of the coordinates of the 
point. 

Let us find, for instance, the distance r from the origin to the
point A with coordinates x and y, that is, the length r of the
line segment OA joining the origin O and the point A
(Figure 1.2.5), and the angle $\alpha$ between the straight line
OA and the axis of abscissas. From point A we drop the
perpendiculars AB and AC on the axes of coordinates. The length
of OB is equal to |x|, and that of segment AB is equal to the
length of OC, or |y|. From the right triangle OAB, by the
Pythagorean theorem, we have

$(OA)^2 = r^2 = (OB)^2 + (AB)^2 = x^2 + y^2$,

or

$r = \sqrt{x^2 + y^2}$. (1.2.1)

By the definition of the tangent,

$\tan \alpha = \frac{y}{x}$. (1.2.2)

For example, suppose that x = 2 and y = 3 (see Figure 1.2.5).
Then

$r = \sqrt{13} \approx 3.6$, $\alpha = \arctan \frac{3}{2} \approx 56^\circ$.

Note that the angle $\alpha$ is reckoned from the positive
direction of the x axis counterclockwise. For this reason, if,
say, x = -2 and y = 2 (point A in Figure 1.2.6), the angle
$\alpha$ is obtuse: $\tan \alpha = 2/(-2) = -1$ and
$\alpha = 135^\circ$. When a point lies below the x axis, the
angle $\alpha$ proves to be greater than 180°. In Figure 1.2.6 we
have two instances of this case: point B(2, -2), for which
$\alpha = 315^\circ$, and point C(-3, -3), for which
$\alpha = 225^\circ$. Since such large angles often prove to be
inconvenient, it is customary to reckon them from the same
positive direction of the x axis but clockwise, taking $\alpha$
negative. For point B we then have $\alpha = -45^\circ$, and for
point C we have $\alpha = -135^\circ$. In other words, we can
assume that for any point the angle lies either within the range
from 0 to 360° (without the value of 360°, since it is more
convenient to take $\alpha = 0^\circ$ instead of
$\alpha = 360^\circ$) or within the range from -180° to 180°,
with the values $\alpha = -180^\circ$ and $\alpha = 180^\circ$
belonging to points on the negative “half” of the x axis (here
both values of $\alpha$ have an equal status).

Figure 1.2.3

Figure 1.2.5

In this respect formula (1.2.2) is incomplete since it does not
enable us to determine whether a point lies in the I or III
quadrant (or in the II or IV quadrant): in these two quadrants
the tangent has the same sign. To determine $\alpha$ completely,
we must take into account the signs of x and y, which make it
possible to determine the quadrant in which the particular point
lies, or use more complete formulas than (1.2.2), which are also
more complicated:

$\cos \alpha = \frac{x}{r}$, $\sin \alpha = \frac{y}{r}$, $r = \sqrt{x^2 + y^2}$.

(1.2.2a)
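
The quadrant ambiguity just described can be checked numerically.
The following is a minimal sketch, not part of the original text:
the function name `to_polar` is an illustrative assumption, and
Python's standard `math.atan2(y, x)` is used because it takes the
signs of both x and y into account, which is what formulas
(1.2.2a) achieve. The angle is returned in the range from -180°
to 180°.

```python
import math

def to_polar(x, y):
    """Return (r, alpha) for the point (x, y), with alpha in degrees.

    alpha is reckoned counterclockwise from the positive x axis and
    lies in the range (-180, 180]. The name to_polar is hypothetical.
    """
    r = math.hypot(x, y)                    # formula (1.2.1)
    alpha = math.degrees(math.atan2(y, x))  # quadrant-aware, unlike arctan(y/x)
    return r, alpha

# The points used in the text: (2, 3), A(-2, 2), B(2, -2), C(-3, -3).
for point in [(2, 3), (-2, 2), (2, -2), (-3, -3)]:
    r, alpha = to_polar(*point)
    print(point, round(r, 2), round(alpha, 1))
```

Note that the plain arctangent of y/x would give the same value
for A(-2, 2) and B(2, -2), since in both cases y/x = -1.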

It is easy to solve the inverse problem: suppose we are given a
point A at a given distance r from the origin O, with the line
segment OA forming an angle $\alpha$ with the x axis (the
positive direction of the x axis is assumed as usual). It is
required to find the coordinates of point A. Looking at
Figure 1.2.5, we see that

$x = r \cos \alpha$, $y = r \sin \alpha$. (1.2.3)

These formulas are valid, without exception, for arbitrary
positive and negative angles $\alpha$ and yield the proper
signs of x and y in any quadrant.
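
As a hedged illustration of formulas (1.2.3), the sketch below
converts a distance and an angle back to Cartesian coordinates.
The function name `from_polar` and the choice of degrees as the
unit are assumptions made for this example only.

```python
import math

def from_polar(r, alpha_degrees):
    """Return (x, y) for distance r and angle alpha given in degrees.

    Implements x = r cos(alpha), y = r sin(alpha), formulas (1.2.3).
    The name from_polar is hypothetical.
    """
    alpha = math.radians(alpha_degrees)
    return r * math.cos(alpha), r * math.sin(alpha)

# Point B of Figure 1.2.6 lies at distance sqrt(8) from the origin;
# the angles 315 degrees and -45 degrees describe the same point.
r = math.sqrt(8)
print(from_polar(r, 315))
print(from_polar(r, -45))
```

Both calls yield, up to rounding error, the point (2, -2), so the
signs of x and y come out correctly without any case analysis.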

Clearly, the position of a point A in a plane can be fixed by
specifying the Cartesian coordinates x and y; however, instead
of these two numbers we can use the distance r and the angle
$\alpha$. These two numbers, r and $\alpha$, are known as polar
coordinates of point A; point O is called the pole of polar
coordinates in the plane, and the ray Ox is called the polar
axis (the Oy axis does not take part in the definition of polar
coordinates). Thus, formulas (1.2.2a) and (1.2.3) define the
transition from rectangular Cartesian coordinates to polar
coordinates and vice versa.

Let us now examine problems involving two points, $A_1$ and
$A_2$. We denote the coordinates of the first point by $x_1$ and
$y_1$, and the coordinates of the second point by $x_2$ and
$y_2$ (Figure 1.2.7). We wish to find the distance $r_{12}$
between these points and the angle $\alpha_{12}$ between line
segment $A_1A_2$ and the x axis.$^{1.1}$

Figure 1.2.6

Figure 1.2.7

It is convenient to draw through $A_1$ a straight line parallel
to the x axis and through $A_2$ a line parallel to the y axis (in
Figure 1.2.7 they are shown as dashed lines and their point of
intersection is B). In the triangle $A_1A_2B$ the line segment
$A_1B$ is equal to $|x_2 - x_1|$ and the segment $A_2B$ is equal
to $|y_2 - y_1|$. The construction of triangle $A_1A_2B$ is
similar to the construction of triangle OAB in Figure 1.2.5.

By the Pythagorean theorem,

$r_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$. (1.2.4)

The angle $\alpha_{12}$ is found from the condition

$\tan \alpha_{12} = \frac{y_2 - y_1}{x_2 - x_1}$, (1.2.5)

or (cf. (1.2.2a))

$\cos \alpha_{12} = \frac{x_2 - x_1}{r_{12}} = \frac{x_2 - x_1}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}$,

$\sin \alpha_{12} = \frac{y_2 - y_1}{r_{12}} = \frac{y_2 - y_1}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}$.

(1.2.5a)

$^{1.1}$ The subscripts on the letters are known as indices (the
Latin word index means to indicate). They are read: “x sub one”
for $x_1$, “A sub two” for $A_2$. The same letter with different
indices is used in place of a variety of letters to emphasize
that we are dealing with similar (yet different) quantities. For
instance, $x_1$ and $x_2$ are quantities on the x axis (both are
abscissas), but they refer to distinct points. Quantities denoted
by different letters but the same index refer to one and the same
point: $A_1$ denotes a certain point, $x_1$ denotes the abscissa
of that point, and $y_1$ denotes the ordinate of that same point.
We sometimes use double-index notation: $r_{12}$, which is read
“r sub one two” and not “r sub twelve”, is the distance between
the first point, $A_1$, and the second, $A_2$.



The reader should assure himself that the formulas (1.2.4),
(1.2.5), and (1.2.5a) hold true for arbitrary signs of all four
quantities $x_1$, $y_1$, $x_2$, $y_2$ and for any relationships
between the coordinates: $x_1 > x_2$ or $x_1 < x_2$, $y_1 > y_2$
or $y_1 < y_2$. For example, in Figure 1.2.8 we have the case
$x_1 < 0$ and $x_2 > 0$, with $A_1 = A_1(-2, 1)$ and
$A_2 = A_2(3, 3)$. The length of segment $A_1B$ is equal to the
sum of the absolute values $|x_1| = 2$ and $|x_2| = 3$, which is
strictly in accord with the general formula
$A_1B = x_2 - x_1 = 3 - (-2) = 5$. Consequently, the expressions
for $r_{12}$ and $\tan \alpha_{12}$ are also correct.
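
The two-point formulas can be tried on the example of
Figure 1.2.8. This is only a sketch under stated assumptions: the
function name `segment` is invented for the illustration, and the
angle is computed with `math.atan2` so that the signs of both
coordinate differences are respected, in the spirit of (1.2.5a).

```python
import math

def segment(p1, p2):
    """Return (r12, alpha12) for the segment from p1 to p2.

    r12 follows formula (1.2.4); alpha12 is in degrees, measured
    from the positive x axis. The name segment is hypothetical.
    """
    dx = p2[0] - p1[0]   # x2 - x1
    dy = p2[1] - p1[1]   # y2 - y1
    r12 = math.hypot(dx, dy)
    alpha12 = math.degrees(math.atan2(dy, dx))
    return r12, alpha12

# The points of Figure 1.2.8: A1(-2, 1) and A2(3, 3).
r12, alpha12 = segment((-2, 1), (3, 3))
print(round(r12, 3), round(alpha12, 1))
```

Here the differences are 5 and 2, so the distance is the square
root of 29 and the tangent of the angle is 2/5.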


Exercises 

1.2.1. Plot the points (1, 1), (—1, 1), 
(— 1, —1), and (1, —1). 

1.2.2. Plot the points (1, 5), (5, 1), (-1, 5),
(-5, 1), (-5, -1), (1, -5), and (5, -1).

1.2.3. Plot the points (0, 4), (0, -4),
(4, 0), and (-4, 0).

1.2.4. Find the distance from the origin
and the angle $\alpha$ for the points (1, 1), (2, -2),
(-3, -3), and (-4, 4).

1.2.5. Find the distance between the following
pairs of points: $A_1(1, 1)$ and $A_2(1, -1)$,
$A_1(1, 1)$ and $A_2(-1, -1)$, $A_1(2, 4)$ and
$A_2(4, 2)$, and $A_1(-2, -4)$ and $A_2(-4, -2)$.

1.2.6. Write out the coordinates of the 
vertices of a square with side a if the diago- 
nals of the square coincide with the x and y 
axes. 

1.2.7. Write out the coordinates of the 
vertices of a regular hexagon with side a if 
one of the diagonals coincides with the x 
axis and the center lies at the origin. 

1.2.8. (a) Write out the coordinates of the
vertices of an equilateral triangle with side
a, with the base on the x axis, and with the
vertex of the subtended angle on the y axis;
(b) the same if the base lies on the x axis and
the vertex of one of the angles lies at the
origin.

1.2.9. Given a point $A_1$ with coordinates
$x_1$ and $y_1$. Write out the coordinates of point
$A_2$ symmetric to $A_1$ about the x axis; the
same for $A_3$ symmetric to $A_1$ about the y
axis; the same for $A_4$ symmetric to $A_1$ with
respect to the origin.

1.3 Graphical Representation 
of Functions. The Equation 
of the Straight Line 

In Section 1.2 it was shown that each 
pair of values x, y is associated with a 
definite point in the plane. If it is giv- 
en that y is a definite function of x, 
then this means that to every value of 
x there corresponds a definite value of 
y. Therefore, if we are given a range of 
values of x , we can find the various 
corresponding values of y , and these 
pairs of values, (x, y), will yield many 
points in the plane. If we increase the 
number of distinct values of x by tak- 
ing them closer and closer together, 
then finally the points will merge into 
a solid curve. This curve is called the 
graph of the function. 

Actually, only a few points suffice 
to plot a graph, the intermediate points 
and the whole graph (curve) of the func- 
tion being obtained by joining the 
points with a smooth curve. However, 
in order to avoid crude errors, we must 
have a general picture of the form of 
curves representing various functions. 
We begin with a few of the more typical 
and important functions. 

We consider the linear relationship,
or linear function,

y = kx + b. (1.3.1)

Suppose, say, y = 2x + 1. We construct a few points for which
x and y are given in the table below:

x    0    1    2    3

y    1    3    5    7

Now plot these points on the graph in Figure 1.3.1. It is
immediately seen that these points lie on one straight line. In
this case, we draw the straight line (whence the term “linear
relationship” or “linear function”) and obtain the entire graph
of the function: for any x, the corresponding point (x, y) lies
on the straight line that connects any two points of the graph.

Figure 1.3.1

How do we prove that, for any function of the form y = kx + b
(for arbitrary k and b), all points of the graph lie on a single
straight line? In other words, how can we determine, without
construction, merely by computing from the values of
coordinates, that three points, $A_1$, $A_2$, and $A_3$, lie on
a single line? It is clear that if the angle $\alpha_{12}$
between the segment $A_1A_2$ and the x axis is equal to the
angle $\alpha_{13}$ between the segment $A_1A_3$ and the x axis,
this means that the line segments $A_1A_2$ and $A_1A_3$ belong
to one straight line, and if this is not so, then the segments
belong to different straight lines. In Figure 1.3.2 we have the
case where $\alpha_{13}$ is greater than $\alpha_{12}$ and point
$A_3$ lies above the extension of $A_1A_2$; the same figure
shows that if $\alpha_{13}$ were equal to $\alpha_{12}$, then
$A_3$ would be on the straight line that is the extension of
$A_1A_2$.

Figure 1.3.2

From the expression of the tangent of angle $\alpha_{12}$
(1.2.5), it follows that at $\alpha_{12} = \alpha_{13}$ we have
the following relationship between the coordinates of points
$A_1$, $A_2$, and $A_3$:

$\frac{y_2 - y_1}{x_2 - x_1} = \frac{y_3 - y_1}{x_3 - x_1}$. (1.3.2)

Without using trigonometry, we can say that condition (1.3.2) is
a condition of similarity of two right triangles $A_1A_2B$ and
$A_1A_3C$ (see Figure 1.3.2). The similarity of the triangles
indicates that the angles at the vertex $A_1$ are equal.

The relation (1.3.2) is also applicable in the case where point
$A_1$ lies between points $A_2$ and $A_3$ (Figure 1.3.3); if the
three points lie on one line, then from the similarity of the
triangles $A_1A_2B$ and $A_1A_3C$ follows the proportion
(1.3.2). In the example given in Figure 1.3.3, $x_3 - x_1 < 0$
and $y_3 - y_1 < 0$, but their ratio is positive and equal to
the ratio of two positive quantities, $x_2 - x_1$ and
$y_2 - y_1$.

Now let us verify that condition (1.3.2) remains valid for any
triple of points of the linear function (1.3.1). Consider two
points, $A(x_1, y_1)$ and $B(x_2, y_2)$, whose coordinates obey
Eq. (1.3.1). In this case $y_1 = kx_1 + b$ and $y_2 = kx_2 + b$,
with the result that
$y_2 - y_1 = kx_2 + b - (kx_1 + b) = k(x_2 - x_1)$, whence

$\frac{y_2 - y_1}{x_2 - x_1} = k$.

The ratio proves to be independent of $x_1$ and $x_2$. Hence,
for any other pair of points of the graph, in particular for the
points $A(x_1, y_1)$ and $C(x_3, y_3)$, we also get

$\frac{y_3 - y_1}{x_3 - x_1} = k = \frac{y_2 - y_1}{x_2 - x_1}$.

This means that for any three points of the graph,
$A(x_1, y_1)$, $B(x_2, y_2)$, and $C(x_3, y_3)$, the relation
(1.3.2) is valid, which means that any three points of the graph
lie on a single straight line and, hence, all points of the
graph of the function y = kx + b belong to one straight line.
Thus, the graph of the function y = kx + b is a straight line,
which we will often call, for the sake of brevity, “the straight
line y = kx + b” (we will also call it “the
straight line (1.3.1)”).
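
Condition (1.3.2) can be tested by computation alone. The sketch
below is an illustration under assumptions: the function name
`collinear` and the tolerance `tol` are invented here, and the
proportion is used in cross-multiplied form,
$(y_2 - y_1)(x_3 - x_1) = (y_3 - y_1)(x_2 - x_1)$, which is
equivalent to (1.3.2) whenever both denominators are nonzero and
avoids division when they are not.

```python
def collinear(a1, a2, a3, tol=1e-12):
    """Return True if the three points lie on one straight line.

    Uses condition (1.3.2) in cross-multiplied form. The name
    collinear and the tolerance tol are hypothetical choices.
    """
    (x1, y1), (x2, y2), (x3, y3) = a1, a2, a3
    return abs((y2 - y1) * (x3 - x1) - (y3 - y1) * (x2 - x1)) <= tol

# Three points of the graph of y = 2x + 1 from the table above.
print(collinear((0, 1), (1, 3), (3, 7)))
# Raising the last point by one unit takes it off the line.
print(collinear((0, 1), (1, 3), (3, 8)))
```

The first call reports that the points lie on one line and the
second that they do not.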

The equation y = kx + b is called the equation of the straight
line. The coefficient k determines the angle between the
straight line and the x axis. Substituting x = 0 into the
equation yields y = b, which means that one of the points of the
straight line is the point (0, b). This point lies on the y axis
at a distance b above the origin if b is positive or at a
distance |b| below the origin if b is negative. Thus, b is the
ordinate of the point of intersection of the straight line and
the y axis (sometimes b is called the initial ordinate of the
straight line), and |b| is the length of the line segment cut
off by the straight line on the axis of ordinates (in
Figure 1.3.1, b = 1), which is called
the y intercept.

To construct a straight line corresponding to a given equation
one need not compute the coordinates of a large number of points
and plot them on the graph: it is clear that the construction of
two points fully determines the straight line passing through
these two points. For instance, we can always take two points,
y = b for x = 0 and y = b + k for x = 1, and draw the line. For
the second point we could also take the point of intersection of
the straight line and the x axis, that is, the point with
$x = x_0$ and y = 0. From the condition $y = kx_0 + b = 0$ we
find that $x_0 = -b/k$, which is known as the
x intercept.
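
The two intercepts are easily computed from k and b. The
following sketch is illustrative only: the function name
`intercepts` is an assumption, and so is the convention of
returning `None` for the x intercept when k = 0, a case in which
the formula $x_0 = -b/k$ does not apply.

```python
def intercepts(k, b):
    """Return (x0, y_at_zero) for the straight line y = kx + b.

    x0 = -b/k is the x intercept and y_at_zero = b is the ordinate
    at x = 0. For k = 0 the line is horizontal, so None is returned
    in place of x0 (a convention assumed for this sketch).
    """
    x0 = None if k == 0 else -b / k
    return x0, b

# The line y = 2x + 1 of Figure 1.3.1.
print(intercepts(2, 1))
```

For this line the two points (-0.5, 0) and (0, 1) are enough to
draw the whole graph.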

It is useful to do some drilling in the 
construction of graphs so as to be able 
to glance at an equation and picture 
roughly the variation and the position 
of the curve in question. 

This is easy to do when we have a linear function whose graph is
a straight line. The line depends only on two quantities, k and
b, of the equation. Thus, not so many variants have to be
examined: k can be positive or negative, k can be large or small
in absolute value (greater than 1 or less than 1), and b can be
positive or negative or even zero. Let us see how to carry out
such an investigation.

We start with the case b = 0, or the equation y = kx. The
straight line here will clearly pass through the origin, that
is, through the point with x = 0 and y = 0. Figure 1.3.4 depicts
several straight lines with different k's, whose values stand at
the endpoints of each straight line. Check the correctness of
each line and you will feel sure of the following general
conclusions:

(1) If k is positive, the straight line lies in the first and
third quadrants, while if k is negative, the straight line lies
in the second and fourth quadrants.

(2) By the foregoing, if k = 1, the line lies in the first and
third quadrants. The part of the straight line in the first
quadrant forms an angle $\alpha = 45^\circ$ with the x axis,
which means that it bisects the angle between the x and y axes.
The “angle with the x axis” here stands for the angle with the
positive direction of the x axis (the one with the arrowhead).
An extension of the straight line lying in the third quadrant
forms an angle $\alpha = -135^\circ$ with the x axis. The entire
straight line bisects the angle between the x and y axes in the
first and third quadrants.

(3) For k = -1, the part of the straight line lying in the
second quadrant forms an angle $\alpha = 135^\circ$ with the x
axis, while the extension of the line in the fourth quadrant
forms an angle $\alpha = -45^\circ$. The entire straight line
bisects the angle between the x and y axes in the second and
fourth quadrants.

(4) If |k| < 1, the straight line is sloping, that is, is closer
to the x axis than to the y axis, and the smaller the |k|, the
closer the line is to the x axis. If |k| > 1, the straight line
is steep, it is closer to the y axis than to the x axis, and the
greater the |k|, the closer the line is to the y axis.

The quantity k is called the slope of the line; it is fixed by
the value of angle $\alpha$ between the straight line and
the x axis.

Now that this is clear, let us investigate the general case of a
straight line with b different from zero. Suppose we have two
straight lines: y = kx (say, with k = 0.5; see Figure 1.3.5),
that is, b = 0 in Eq. (1.3.1), and the straight line with the
same slope k but with $b \neq 0$ (say, y = 0.5x + 2 in
Figure 1.3.5). For the sake of convenience we introduce the
notation $y_0 = 0.5x$ and $y_2 = 0.5x + 2$.$^{1.2}$ For each
given x, the quantity $y_2$ is two units greater than $y_0$. To
summarize, then, the points of line $y_2$ are obtained from the
points of line $y_0$ with the same x by an elevation of two
units. The straight line $y_2$ is therefore parallel to $y_0$
and lies two units above it. Quite obviously, this rule holds
true for any b (if b is negative, the line crosses the y axis
below the origin and lies below the corresponding straight line
y = kx).

Now that we see how straight lines with equations y = kx are
located for distinct k's, we can readily picture the general
positions of straight lines y = kx + b with arbitrary k and b.
Exercises that will help you to drill this material are given at
the end of this section. In the particular case of k = 0 the
equation is y = b (it is assumed that y = b for all values of
x), which is associated with a horizontal straight line with a
slant (slope) of zero (Figure 1.3.6).

We can imagine a man walking from left to right in the direction
of increasing values of x. If k > 0, then he walks uphill (a
positive slope), while if k < 0, the man walks downhill (a
negative slope); if k = 0 (zero slope), the man walks along a
horizontal path.

The slope k indicates the ratio of the variation of the function
to the respective variation of the independent variable. Indeed,
for any two points on the straight line, $A(x_1, y_1)$ and
$B(x_2, y_2)$, we have

$\frac{y(x_2) - y(x_1)}{x_2 - x_1} = \frac{kx_2 + b - (kx_1 + b)}{x_2 - x_1} = k$.

We have already calculated this ratio when we proved that a
linear function on a graph is depicted by a straight line. In
the general case of an arbitrary function y(x) the similar
quantity, $[y(x_2) - y(x_1)]/(x_2 - x_1)$, is the tangent of the
angle between the segment connecting points $(x_1, y(x_1))$ and
$(x_2, y(x_2))$ and the x axis (Figure 1.3.7a).

$^{1.2}$ The subscripts here are used somewhat differently from
our earlier practice: $y_0$ refers to the entire line and not to
the ordinate of a single point; it is the ordinate of an
arbitrary point on a line with given k and b = 0. In the same
manner, $y_2$ is the ordinate of an arbitrary point on a line
with given k and b = 2. In other words, neither $y_0$ nor $y_2$
is a number; they are functions: $y_0 = y_0(x)$ and
$y_2 = y_2(x)$.

A linear function is distinguished by the fact that this ratio
is the same for any two points: it depends neither on $x_2$ nor
on $x_1$ (Figure 1.3.7b). For this reason, all points of a
linear function
belong to a single straight line.
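
The difference between a linear function and an arbitrary one
can be seen by computing the ratio for several pairs of points.
This is a sketch under assumptions: the names
`difference_quotient`, `linear`, and `square` are invented for
the illustration, and the square function is chosen here only as
a simple example of a function that is not linear.

```python
def difference_quotient(f, x1, x2):
    """Return [f(x2) - f(x1)] / (x2 - x1) for x1 != x2."""
    return (f(x2) - f(x1)) / (x2 - x1)

def linear(x):
    # The line y = 0.5x + 2 of Figure 1.3.5.
    return 0.5 * x + 2

def square(x):
    # A function that is not linear, used for comparison.
    return x * x

for x1, x2 in [(0, 1), (1, 3), (-3, 10)]:
    print(x1, x2,
          difference_quotient(linear, x1, x2),
          difference_quotient(square, x1, x2))
```

For the linear function the ratio equals the slope 0.5 for every
pair, whereas for the square function it changes from pair to
pair.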

Note, in addition, that a straight line parallel to the y axis
(such a straight line does not represent any function) is
written as x = a (it is assumed that x = a with y arbitrary; see
Figure 1.3.8, where a is negative). It is natural to assume that
the k of such a straight line is infinite ($k = \infty$), since
in this case the ratio $k = (y_2 - y_1)/(x_2 - x_1)$, where
points $(x_1, y_1)$ and $(x_2, y_2)$ belong to the straight
line, has the form c/0 (the straight line x = a is one with an
infinitely great slope).

Now we can easily find the equation of a straight line passing
through a given point $M(x_0, y_0)$ and having a given slope k.
Such an equation will have the form of (1.3.1), with k known,
but b has yet to be found, and the values $x_0$ and $y_0$ must
satisfy this equation (Figure 1.3.9). This suggests the form of
the sought equation:

$y - y_0 = k(x - x_0)$, (1.3.3)

or

$y = kx + (y_0 - kx_0)$. (1.3.3a)

Indeed, the slope of the straight line (1.3.3a) or (1.3.3) is k;
on the other hand, if we substitute $x = x_0$ and $y = y_0$ into
both sides of Eq. (1.3.3), we arrive at an identity, 0 = 0.
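
Formula (1.3.3a) gives the coefficient b directly. The sketch
below is illustrative: the function name `line_through` and the
sample point (2, 1) with slope 3 are assumptions chosen for the
example and do not come from the text.

```python
def line_through(x0, y0, k):
    """Return (k, b) of the line y = kx + b through (x0, y0) with slope k.

    By formula (1.3.3a), b = y0 - k*x0. The name line_through is
    hypothetical.
    """
    return k, y0 - k * x0

# A line with slope 3 through the point (2, 1).
k, b = line_through(2, 1, 3)
print(k, b)
# Substituting x0 back into y = kx + b must return y0.
print(k * 2 + b)
```

Here b = 1 - 3·2 = -5, so the line is y = 3x - 5, and the check
returns the ordinate 1 of the given point.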

Exercises 

1.3.1. Determine whether the point triples lie on a straight
line: $A_1(0, 0)$, $A_2(2, 3)$, $A_3(4, 6)$; $A_1(0, 0)$,
$A_2(2, 3)$, $A_3(-2, -3)$; and $A_1(2, -3)$, $A_2(4, -6)$,
$A_3(-2, 3)$.

1.3.2. Construct the straight lines y = 3x, y = 3x + 2,
y = 3x - 1, y = 2 - x, y = 2 - 0.5x, and y = x - 3.

1.3.3. Find the equation of (a) a straight
line that passes through point A(1, -1) and
has a slope equal to -1, and (b) a straight
line that passes through point B(2, 3) and
has a slope equal to 2.

1.3.4. Prove that the equation of each
straight line that has a nonzero slope (this
slope must not be infinitely large), that is, a
straight line not parallel to either axis, can be
written in the form x/a + y/b = 1 (the intercept
form). What geometrical meaning have
the quantities a and b in this equation?


1.4 Inverse Proportionality and 
the Hyperbola. The Parabola 

Let us recall the idea of direct propor- 
tionality of two quantities. The formu- 
la 

y = kx (1.4.1) 

(geometrically, as we already know, 
this formula represents a straight line) 
means that y is proportional to x: an 
increase in x by a certain factor leads 
to an increase in y by the same factor. 
In other words, for any two points $A(x_1, y_1)$ and
$B(x_2, y_2)$ whose coordinates satisfy (1.4.1) we always have
$y_2/y_1 = x_2/x_1$ (Figure 1.4.1a). It is also commonly said
that formula (1.4.1) expresses the fact of direct
proportionality between y and x (with the proportionality
factor, or coefficient, k). The more general equation

y = kx + b (1.4.1a)

characterizes the proportionality (or direct proportionality)
between the increment $x_2 - x_1$ of the independent variable
and the corresponding increment $y_2 - y_1$ of the function
(dependent variable): for any two increments $x_2 - x_1$ and
$x_4 - x_3$ of the independent variable of the function
(1.4.1a), the corresponding increments of the function
(dependent variable) will be proportional to the increments of
the independent variable (Figure 1.4.1b):

$\frac{y_2 - y_1}{x_2 - x_1} = \frac{y_4 - y_3}{x_4 - x_3}$.

Another often encountered relationship between y and x is

$y = \frac{k}{x}$. (1.4.2)

Such a dependence is known as inverse proportionality (with
coefficient k) between y and x: if points $A(x_1, y_1)$ and
$B(x_2, y_2)$ satisfy Eq. (1.4.2), then $x_2$ is greater than
$x_1$ by a factor by which $y_2$ is less than $y_1$, or
$x_2/x_1 = y_1/y_2$ (since
$y_2 \div y_1 = (k/x_2) \div (k/x_1) = x_1 \div x_2$).

Note that direct and inverse proportionalities are reciprocal:
if y is directly (inversely) proportional to x, then x is
directly (inversely) proportional to y, and vice versa. However,
while the proportionality factor in direct proportionality (the
factor that connects x with y) is the inverse of the
proportionality factor in the (direct) proportionality relation
that connects y with x (if y = kx, then x = (1/k)y), in the
inverse proportionality between y and x the coefficients of
(inverse) proportionality connecting y with x and x with y are
the same (if y = k/x, then x = k/y).

The curve corresponding to Eq. (1.4.2) is known as the
hyperbola.$^{1.3}$ Figure 1.4.2 depicts a “unit” hyperbola

$y = \frac{1}{x}$, (1.4.2a)

and below we give the values of y for
some values of x for this hyperbola:

x    -1    -0.1    -0.01    -0.001    0.001    0.01    0.1    1

y    -1    -10     -100     -1000     1000     100     10     1

$^{1.3}$ We are speaking here only of equilateral hyperbolas,
since by “hyperbola” mathematicians mean a somewhat more general
curve than the one specified by Eq. (1.4.2). However, nowhere in
our narrative do we use such general hyperbolas, and so the
adjective “equilateral” will always be dropped.


The hyperbola has the peculiarity that 
for small positive x the value of y is 
very large, while for small negative x 
the value of y is also negative and very 
large in absolute value. 
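
The table of values can be reproduced, and extended to still
smaller x, with a few lines of code. This is a minimal sketch:
the function name `hyperbola` and its default coefficient k = 1
are assumptions made for the illustration.

```python
def hyperbola(x, k=1.0):
    """Return y = k/x for nonzero x (k = 1 gives the unit hyperbola)."""
    return k / x

# The abscissas of the table above.
for x in [-1, -0.1, -0.01, -0.001, 0.001, 0.01, 0.1, 1]:
    print(x, hyperbola(x))

# For any point of the curve the product x*y equals k.
print(0.001 * hyperbola(0.001))
```

The smaller the absolute value of x, the larger the absolute
value of y, while the sign of y follows the sign of x.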

This property of curve (1.4.2a) (or the general curve (1.4.2))
is often expressed by stating that at x = 0 we have
$y = \pm\infty$, where the plus or minus sign is chosen
depending on the side from which we approach x = 0. The exact
meaning of “$y = \pm\infty$ at x = 0” is that for a sufficiently
small x the value of y becomes arbitrarily large (in absolute
value), with it being positive or negative depending on the sign
of x (the symbol $\infty$ is read “infinity” and is of course
not a number). The smaller the value of x we choose (in absolute
value, of course), the higher (or the lower in the case of
x < 0) the corresponding point lies on the hyperbola: both
branches of hyperbola (1.4.2a) “go off to infinity” (in the
positive or negative direction) along the y axis (upward or
downward) and approach the y axis without limit (but never
intersect it), since the abscissa can be arbitrarily small.
Similarly, as x increases without limit (in absolute value),
that is, when x “tends to $+\infty$” or “tends to $-\infty$” (we
remind the reader once more that the symbol $\infty$ does not
correspond to a number), the value of y = 1/x becomes
arbitrarily small (in absolute value): the hyperbola “goes off
to infinity along the x axis”, approaching this axis without
limit but never intersecting it (the two branches approach the x
axis from different directions). This property of the hyperbola
is expressed briefly by saying that the straight lines x = 0
(the y axis) and y = 0 (the x axis) are the asymptotes to the
hyperbola (from the Greek word asymptotos meaning
“not meeting”).

As seen from Figure 1.4.2, the hyper- 
bola (1.4.2a) consists of two parts, or 
branches, corresponding to x > 0 and 
x < 0; these branches are separated, 
that is, they do not intersect. 

The arbitrary curve (1.4.2) can be obtained from the curve
(1.4.2a) by a simple transformation. Suppose that
$A_0(x_0, y_0)$ is a point of the curve (1.4.2a), that is,
$y_0 = 1/x_0$ (see Figure 1.4.3, in which the curve (1.4.2a) is
shown by a dashed curve). In this case the same value $x = x_0$
has corresponding to it on curve (1.4.2), where we assume that k
is positive, a point A with coordinates $x_0$ and
$y = k/x_0 = k(1/x_0)$, that is, $A(x = x_0, y = ky_0)$. This
point was obtained from point $A_0$ by a “stretching”
transformation along the y axis (or to put it differently, by a
“stretching” transformation away from the x axis) with a
transformation coefficient equal to k, that is, a k-fold
increase in all vertical dimensions: it is farther away from the
x axis than point $A_0$ by a factor of k.

Here we assume that k > 1; when k is positive but less than
unity, the curve (1.4.2) is obtained from the curve (1.4.2a) by
a “shrinking” transformation along the y axis (or by a
“shrinking” transformation toward the x axis), since each point
of the curve (1.4.2) is closer to the x axis than the respective
point of the curve (1.4.2a) by a factor of k (in this case the
ratio $y \div y_0 = AP \div A_0P = k$ is less than
unity).$^{1.4}$

Figure 1.4.3 shows the curve y = 1/x (the dashed curve), the
curve y = 2/x (the solid curve), and the curve corresponding to
the function y = (-1/2)/x, which corresponds to a negative value
of k equal to -1/2 (the dotted curve); this last function is
negative for positive x and positive for negative x.

The direct and inverse proportionalities are often encountered
in physical laws. For instance, Ohm's law (1.1.3) states that
current i in a conductor changes in direct proportion to the
potential difference u and in inverse proportion to the
resistance R of the conductor. For a given R, the current i is
directly proportional to u (with a proportionality coefficient
equal to 1/R); on the other hand, for a given u the current i is
inversely proportional to R, or i = k/R, where potential
difference u plays the role of k. A similar simple relationship
s = vt holds for the distance s traveled in uniform motion with
a velocity v in the course of a time interval t. It shows that
t = s/v, or that the time of traversal is directly proportional
to distance s and inversely proportional to velocity v. Finally,
in Boyle’s law, pressure is in inverse proportion to volume of
gas. Examples abound. The formulas s = vt and t = s/v show that,
in uniform motion, the distance s traveled is proportional to
the time t of traversal with the coefficient of proportionality
equal to v, while t is proportional to s with coefficient 1/v,
that is, if, say, v = 20 m/s, then the above-noted coefficients
of proportionality are 20 m/s and 0.05 s/m, respectively. The
situation with Boyle’s law pV = constant is different. Here, if
the temperature is 0 °C and the volume V and pressure p are
measured in m$^3$ and Pa (pascal; see Appendix 5:
1 Pa = 1 N/m$^2$ = $1.450377 \times 10^{-4}$ lb (wt)/in.$^2$),
it can be proved (see Section 11.2) that pV = 0.8737 N·m, and we
see that p is inversely proportional to V, and V is inversely
proportional to p, and in both cases the coefficients of
proportionality are equal to 0.8737 N·m.

$^{1.4}$ On the connection between curves
(1.4.2) and (1.4.2a) see Exercise 1.4.3.

These examples provide a good illustration of the difference
between the coefficients in direct proportionality y = kx and
inverse proportionality $y = k_1/x$. If x is measured in units
of $e_1$ and y in units of $e_2$, then k has the dimensions of
$e_2/e_1$, whereby there is no way in which k can serve as the
coefficient k' in x = k'y, since the latter coefficient must be
measured in units of $e_1/e_2$. The situation is opposite with
$k_1$, which has the dimensions of $e_1 e_2$. This implies that
$k_1$ may (and actually does) coincide with the coefficient
$k_1'$ in $x = k_1'/y$.

Let us now turn to the quadratic function $y = ax^2 + bx + c$.
We start with the simplest case:

$y = ax^2$, (1.4.3)

where coefficient a may be arbitrary. For the sake of simplicity
we take a = 1, that is, we consider the curve$^{1.5}$

$y = x^2$. (1.4.3a)

What general properties does this
function have?

1. It is always true that y > 0, both for x > 0 and for x < 0.
Only for x = 0 do we have y = 0. This means that the entire
curve lies above the x axis and touches the x axis only at the
origin.

2. y has a minimum (smallest value) at x = 0. The minimum is
equal to 0. On the graph, the minimum is the lowest point of the
curve.

3. Associated with two values of x identical in absolute value
but with opposite signs are values of y identical both in sign
and absolute value. This means that the curve is symmetric about
the y axis.

Figure 1.4.4

$^{1.5}$ In a certain sense this case can be
thought of as general (see Section 1.8).

The curve (1.4.3a) is shown in Figure 1.4.4 by a solid line. It
is called a parabola, and this term will be used for all curves
(1.4.3) with arbitrary a (and for curves of a broader class, as
we will see below). For every positive value of a, the curve
(1.4.3) possesses the same properties 1 to 3 as the curve
(1.4.3a). Indeed, the transition from Eq. (1.4.3a) to
Eq. (1.4.3) in all respects is similar to the transition from
the “unit” hyperbola (1.4.2a) to the “general” hyperbola (1.4.2)
(with a positive coefficient k): curve (1.4.3) is obtained from
curve (1.4.3a) by stretching all the dimensions along the y axis
a-fold (see Figure 1.4.4, where we have depicted parabolas
$y = x^2$ and $y = 2x^2$).

What will happen if a < 0? Consider an example with a = −2, that is,
the curve y = −2x^2. The dotted line (the one below the x axis) in
Figure 1.4.4 depicts this curve. The properties of this curve are:

1. y < 0 for arbitrary x ≠ 0. The whole curve lies below the x axis and
touches the x axis at the origin.

2. The function has a maximum value at x = 0. The maximum is equal to 0.
On the graph the maximum is the uppermost point of the curve.

3. The curve is symmetric about the y axis, just as in the case of
positive a.

Figure 1.4.5

Now let us consider a more general equation

y = a(x − n)^2. (1.4.4)

We take a = 1 and n = 3. The respective curve is depicted in Figure 1.4.5.
This is the same parabola as y = x^2 but it has been displaced rightward
three units along the x axis.

This simple fact is not usually realized as readily as it should be. If a
function y = f(x) is given and we compare it with another function
y = f(x − n), the graph of the second function is shifted rightward from
that of the first by n units (assuming n is positive). It is assumed here
that in both cases f is one and the same function. In our example, the
symbol f denotes a squaring of the independent variable, that is, the
quantity inside the parentheses:

f(x) = x^2, f(−x) = (−x)^2 = x^2,
f(x − 2) = (x − 2)^2,
f(x − n) = (x − n)^2,
f(x^2) = (x^2)^2 = x^4, and so on.

But why is the graph shifted to the right? We will go into this in more
detail. Suppose the graph of the function y = f(x) has some kind of
characteristic point x = x_0 (a kind of notch, so to say). For example, at
this point the function may have, say, a salient point or a maximum or
merely assume some definite value y_0 (Figure 1.4.6). Then that same value
y_0 or the same salient point or the maximum will appear on the graph of the
new function y_1 = f(x − n) when the independent variable in the function f
is equal to the old value x_0, that is, at x − n = x_0. This means that now
the coordinates of the notch are x = x_0 + n, y_0 = f(x_0). It is clear then
that any notch, as it were, moves together with the whole graph to the
right: point (x_0, y_0) on the first graph has corresponding to it on the
second graph the point (x_0 + n, y_0) (see Figure 1.4.6 and the solid and
dashed parabolas in Figure 1.4.5, where for the notch we can take the point
x_0 = 0, y_0 = f(x_0) = 0^2 = 0).

Figure 1.4.6
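The rightward shift can also be checked numerically. The following is only a small sketch: the helper name `shifted` and the sample points are illustrative choices, not part of the text. It confirms that every point (x_0, y_0) of the graph of f reappears at (x_0 + n, y_0) on the graph of f(x − n).

```python
def f(x):
    """The 'unit' parabola y = x^2."""
    return x ** 2


def shifted(func, n):
    """Return the function x -> func(x - n) (hypothetical helper)."""
    return lambda x: func(x - n)


n = 3
g = shifted(f, n)  # g(x) = (x - 3)^2

# The "notch" of f (its minimum at x0 = 0) shows up on g at x0 + n = 3.
print(f(0), g(0 + n))

# Every point (x0, y0) on the first graph corresponds to (x0 + n, y0)
# on the second graph, so the whole curve moves to the right.
for x0 in (-2, -1, 0, 1, 2):
    assert g(x0 + n) == f(x0)
```

Subtracting 3 from x means that x itself must be 3 larger to reproduce the old value, which is why the curve moves right and not left.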

All this is very simple and elementary, 
but it is extremely important and the 
student should not merely learn it but 
fully comprehend the meaning of it. 
The first urge of most students is to say that when we replace y = x^2 with
y = (x − 3)^2 the curve is displaced to the left because we subtract 3 from
the value of x. It is well worth your time to make a detailed analysis of
the examples, which demonstrate this common error.

Now we can state the general rules: 

1. The curve y = a(x − n)^2 has the vertical line x = n for its axis of
symmetry.

2. This curve, for a > 0, lies above the x axis and has a minimum y = 0
at x = n, while for a < 0 it lies below the x axis and has a maximum y = 0
at x = n.

Finally, there is yet another modi- 
fication of this equation that does not 
alter the shape of the curve. Let us 
consider the function 

y = a(x − n)^2 + m. (1.4.5)

This curve (also a parabola) clearly 
differs from the preceding one (without 
m) solely in the vertical displacement 
by the quantity m. The position of the 
axis of symmetry of the curve remains 
unchanged; for a > 0 the function has 
a minimum at x = n and the value of 
the function at the minimum is equal 
to y = m (the minimum, together with 
the whole curve, was shifted by the 
amount m), while for a < 0 the point 
(x = n, y = m) is the point of maxi- 
mum of the curve. 

Two examples will suffice (Figure 1.4.7): y = (x − 3)^2 + 2 and
y = −(x − 3)^2 + 2. The axes of symmetry in both parabolas are depicted
in Figure 1.4.7 by dotted straight lines: the minimum point in Figure 1.4.7a
and the maximum point in Figure 1.4.7b lie at the intersection of the
respective curve and its axis of symmetry.

Removing the brackets in the expression y = a(x − n)^2 + m yields

y = ax^2 − 2anx + an^2 + m. (1.4.5a)

On the right-hand side of (1.4.5a) we have a polynomial of degree two, which
in its most general form is written as

y = ax^2 + bx + c. (1.4.6)

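This formula can be transformed as follows:

$$ y = a\left(x^2 + \frac{b}{a}x + \frac{c}{a}\right)
     = a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) + a\left(\frac{c}{a} - \frac{b^2}{4a^2}\right)
     = a\left(x + \frac{b}{2a}\right)^2 + \left(c - \frac{b^2}{4a}\right). $$

Hence,

$$ ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2 + \left(c - \frac{b^2}{4a}\right), $$

which implies that curve (1.4.6) is also a parabola with an axis of symmetry
x = −b/2a and a minimum point or maximum point (−b/2a, c − b^2/4a).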
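As a quick check of this transformation, here is a minimal sketch (the function names `vertex`, `standard_form`, and `vertex_form` are illustrative, not taken from the text) that computes n = −b/2a and m = c − b^2/4a and compares the two forms of the polynomial at a few points.

```python
def vertex(a, b, c):
    """Return (n, m) such that a*x^2 + b*x + c == a*(x - n)^2 + m."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic")
    n = -b / (2 * a)          # abscissa of the axis of symmetry
    m = c - b * b / (4 * a)   # ordinate of the minimum or maximum point
    return n, m


def standard_form(a, b, c, x):
    return a * x * x + b * x + c


def vertex_form(a, n, m, x):
    return a * (x - n) ** 2 + m


# The two parabolas of Exercise 1.4.2.
for a, b, c in [(1, -2, 2), (2, 4, 0)]:
    n, m = vertex(a, b, c)
    print((a, b, c), "-> axis x =", n, ", extreme value y =", m)
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        assert abs(standard_form(a, b, c, x) - vertex_form(a, n, m, x)) < 1e-12
```

For y = x^2 − 2x + 2 this gives the vertex (1, 1), and for y = 2x^2 + 4x the vertex (−1, −2), which is a convenient starting point for plotting both curves.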

Using the graph of a parabola, we 
can investigate the solution of a quad- 
ratic equation and the various cases that 
arise in this connection. We can ap- 
proach the solution of the quadratic 
equation 

ax 2 + bx + c = 0 


this way: consider the curve

$$ y = ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2 + \left(c - \frac{b^2}{4a}\right) $$

and find the points of intersection of 
this curve with the x axis. At these 
points we have y = 0, and so the val- 
ues of x corresponding to the points 
of intersection are the roots of the 
quadratic equation. 

But we know that the curve y = ax^2 + bx + c is a parabola. We also know
that this parabola has an axis of symmetry, the vertical line x = −b/2a,
and that for a > 0 the parabola has a minimum point on the axis of symmetry
and the altitude (ordinate) of this minimum is y = c − b^2/4a (if we glance
at the right-hand side of the last formula, we see that it has the customary
form a(x − n)^2 + m). For a > 0, the limbs of the parabola point upward.

Figure 1.4.8

It is clear that if the minimum lies above the x axis, the parabola does
not intersect the x axis at any point (Figure 1.4.8, curve 1). This means
that when a > 0 and c − b^2/4a > 0 simultaneously, the quadratic equation
has no roots. [1.6] But if the minimum lies below the x axis and the limbs
of the parabola point upward, the parabola will definitely intersect the
x axis at two points; these points will be symmetric with respect to the
straight line x = n = −b/2a, the axis of symmetry of the parabola (curve 2
in Figure 1.4.8). This means that when a > 0 and c − b^2/4a < 0
simultaneously, the equation has two roots x_1 and x_2 as shown in
Figure 1.4.8.

Finally, there may be an intermediate case where the parabola touches the
x axis (curve 3 in Figure 1.4.8). This case occurs when c − b^2/4a = 0. If
we gradually move curve 2 upward, it will finally coincide with curve 3, the
two roots x_1 and x_2 will come closer to each other and, ultimately, at the
instant of tangency, will merge. That is why, when c − b^2/4a = 0, we speak
not of one root but of two equal (coincident) roots of the equation.

[1.6] Here by “root” we mean a real root of the equation; for the time
being, until Chapters 14 and 15, we will simply ignore complex-valued roots.

The case a < 0 is considered in a similar manner. The respective curve has
a maximum and the limbs point down. The reader is advised to draw the curves
and to verify that (a) when a < 0 and c − b^2/4a < 0 simultaneously, there
are no real roots, (b) when a < 0 and c − b^2/4a > 0, there are two real
roots, and (c) when a < 0 and c − b^2/4a = 0, there are two equal roots
(tangency).

The ordinary formula for the roots of a quadratic equation is

$$ x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. $$

The equation has two real roots when we are able to take a square root of
b^2 − 4ac, that is, when b^2 − 4ac > 0. If we write this as

$$ b^2 - 4ac = -4a\left(c - \frac{b^2}{4a}\right) > 0, $$

we see that this condition is satisfied when

(1) a > 0, c − b^2/4a < 0, and (2) a < 0, c − b^2/4a > 0.

These are the two cases of the existence of two roots, which were obtained
earlier from a consideration of the curves corresponding to the function
y = ax^2 + bx + c.
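The agreement between the geometric criterion (the sign of a together with the sign of the extreme value c − b^2/4a) and the root formula can be illustrated with a short sketch. The function names below are illustrative assumptions, not notation from the text.

```python
import math


def count_real_roots(a, b, c):
    """Classify a*x^2 + b*x + c = 0 from the position of the vertex.

    Returns 0 (no real roots), 1 (tangency: two coincident roots)
    or 2 (two distinct real roots).
    """
    m = c - b * b / (4 * a)  # ordinate of the minimum or maximum point
    if m == 0:
        return 1
    # a > 0 with the minimum below the x axis, or a < 0 with the
    # maximum above it, both mean a * m < 0: two intersections.
    return 2 if a * m < 0 else 0


def real_roots(a, b, c):
    """Real roots from the ordinary formula, as a tuple (possibly empty)."""
    d = b * b - 4 * a * c
    if d < 0:
        return ()
    s = math.sqrt(d)
    return ((-b - s) / (2 * a), (-b + s) / (2 * a))


for coeffs in [(1, -3, 2), (1, 0, 1), (1, -2, 1), (-1, 0, 1), (-1, 0, -1)]:
    print(coeffs, count_real_roots(*coeffs), real_roots(*coeffs))
```

In the tangency case `real_roots` returns the same value twice, which mirrors the convention of speaking of two coincident roots.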

We can approach the question of solving quadratic equations from a somewhat
different angle. Let us divide all terms in the equation ax^2 + bx + c = 0
(where, of course, a ≠ 0) by a. The result is

x^2 + px + q = 0, or x^2 = −px − q, (1.4.7)

where p = b/a and q = c/a. Let us plot in the xy-plane two functions:

y_1 = x^2 (1.4.7a)

(this is the “unit” parabola (1.4.3a)) and

y_2 = −px − q (1.4.7b)

(a straight line). Clearly, for the values of x that satisfy Eq. (1.4.7) we
have y_1 = y_2, that is, these values of x correspond to the points of
intersection of parabola (1.4.7a) with straight line (1.4.7b). But parabola
(1.4.7a) can be plotted with great accuracy (using, say, graph paper), so
that to solve any quadratic equation it suffices to draw the straight line
(1.4.7b) (say, finding any two points of the straight line and using a
ruler) on the same graph paper. The roots can then be “read off” the paper.
The straight line (1.4.7b) may intersect the parabola (1.4.7a) at two points
(as line 1 in Figure 1.4.9 does) or touch it at a single point (as line 2 in
Figure 1.4.9; two coincident points of intersection with the parabola) or
even have no points in common with parabola (1.4.7a) (as line 3 in
Figure 1.4.9). All this corresponds to cases where Eq. (1.4.7) has two
distinct roots or two coincident roots (i.e. one root) or no roots at all.

Figure 1.4.9

The aforesaid also enables us to conclude the following. We know that the
equation ax^2 + bx + c = 0 has a single root if and only if c − b^2/4a = 0,
whereby the equation x^2 + px + q = 0 (the case with a = 1) has a unique
root if and only if

p^2 − 4q = 0. (1.4.8)

Thus, (1.4.8) is the condition for tangency of the straight line (1.4.7b)
and parabola (1.4.7a); in other words, the straight line y = kx + b touches
parabola y = x^2 if and only if [1.7]

k^2 + 4b = 0. (1.4.8a)

In Chapter 2 we derive the same condition (1.4.8a) for tangency of a
straight line and a parabola using another method.
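The tangency condition can be tried out on concrete lines. The sketch below assumes the condition in the form k^2 + 4b = 0 for the line y = kx + b and the parabola y = x^2 (the abscissas of the common points are the roots of x^2 − kx − b = 0); the function name `intersections` is an illustrative choice.

```python
import math


def intersections(k, b):
    """Abscissas of the common points of y = x^2 and y = k*x + b."""
    disc = k * k + 4 * b  # discriminant of x^2 - k*x - b = 0
    if disc < 0:
        return []          # the line misses the parabola
    if disc == 0:
        return [k / 2]     # tangency: two coincident points
    r = math.sqrt(disc)
    return [(k - r) / 2, (k + r) / 2]


print(intersections(0, 1))    # the line y = 1 cuts the parabola twice
print(intersections(2, -1))   # the line y = 2x - 1 touches it at x = 1
print(intersections(0, -1))   # the line y = -1 lies below the parabola
```

The middle case satisfies k^2 + 4b = 4 − 4 = 0, so the line and the parabola share exactly one point.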

Observe, finally, that, depending on the sign of the coefficient a of x^2
in the equation of the parabola (1.4.6), the curve is convex down (a > 0)
or convex up (a < 0). This property does not depend on the values and signs
of b and c in the equation of the parabola (1.4.6).

[1.7] It is clear that if in (1.4.8) we substitute k for −p and b for −q,
we arrive at (1.4.8a).

Figure 1.4.10

An exact definition of convexity is this: take two points A(x_1, y_1) and
B(x_2, y_2) on a curve and draw a line through them (Figure 1.4.10). If the
portion of the curve between the two points lies below the straight line
(below the chord AB of the curve), we say that the curve is convex down,
while if the portion of the curve between the two points lies above the
straight line (above chord AB), we say that the curve is convex up.

The convexity of a parabola is readily seen in a drawing, but we can also
define it algebraically. Take arbitrary values x_1 and x_2 of the abscissa
x. They are associated with points on the parabola, A(x_1, y_1) and
B(x_2, y_2), where y_1 = ax_1^2 + bx_1 + c and y_2 = ax_2^2 + bx_2 + c.

We wish to find the coordinates of point M lying at the midpoint of the
line segment AB (Figure 1.4.10). It may be demonstrated geometrically that
if AM = MB, the coordinates of point M(x_0, y_0) are arithmetic means of
the coordinates of A and B:

$$ x_0 = \frac{x_1 + x_2}{2} \quad \text{and} \quad y_0 = \frac{y_1 + y_2}{2}. $$

(This follows from the fact that in Figure 1.4.10 the length of the midline
Mx_0 of the trapezoid ABx_2x_1 is equal to one-half of the sum of the
lengths of the bases Ax_1 and Bx_2 of the trapezoid.) Now let us find the
coordinates of the point N(X, Y) lying on the parabola for the same value
X = (x_1 + x_2)/2 of the abscissa. Here

$$ Y = aX^2 + bX + c = a\left(\frac{x_1 + x_2}{2}\right)^2 + b\,\frac{x_1 + x_2}{2} + c. $$

The reader can assure himself that

$$ Y - y_0 = a\left(\frac{x_1 + x_2}{2}\right)^2 - \frac{a x_1^2 + a x_2^2}{2} = -a\left(\frac{x_1 - x_2}{2}\right)^2 $$

(since the terms involving b and c cancel out). But the quantity
((x_1 − x_2)/2)^2 is positive for arbitrary distinct x_1 and x_2.
Consequently, for a > 0 we have Y < y_0, or the point on the parabola lies
below the corresponding point on the straight line (i.e. the point having
the same abscissa X), that is, the parabola is convex down. On the other
hand, for a < 0 we have y_0 < Y, and the parabola is convex up.

The hyperbola y = k/x (where we assume that k is positive) consists of
two branches. The reader can clearly see that the branch of the hyperbola
corresponding to positive x's is convex down, while the second branch of the
hyperbola is convex up (see Figure 1.4.11 and Exercise 1.4.4).


Exercises

1.4.1. Plot the following curves: y = 3/x, y = −0.5/x, and y = 1/x + 3.

1.4.2. Plot the following curves: y = x^2 − 2x + 2 and y = 2x^2 + 4x.

1.4.3. Show that the hyperbola y = k/x, where k is positive, can be
obtained from the hyperbola y = 1/x by (a) a stretching transformation from
the origin O (a homothetic transformation) with a coefficient √k (for k < 1
it is more appropriate to speak of a shrinking transformation), which means
that if M' and M are points belonging to the curves y = k/x and y = 1/x and
lying on the straight line that passes through the origin O, then
OM'/OM = √k; and (b) a stretching transformation along the x axis (or away
from the y axis) with a coefficient k (here, too, for k < 1 it is more
appropriate to speak of a shrinking transformation).

1.4.4. Prove algebraically that for k > 0 the hyperbola y = k/x is convex
down when x > 0 and convex up when x < 0.


1.5 Higher-Order Parabolas and
Hyperbolas. The Semicubical Parabola

The curve given by the graph of the function

y = ax^n (1.5.1)

(where n is a positive integer) is often called a parabola of order n. For
instance, Figure 1.5.1 depicts a third-order parabola (or a cubical
parabola)

y = x^3 (1.5.2)

and a fourth-order parabola

y = x^4. (1.5.3)

We see that a fourth-order parabola (1.5.3) resembles an ordinary
(second-order) parabola: it has a symmetry axis (the y axis), at point
(0, 0) it has a minimum (the only one), and at the origin O it touches the
x axis without intersecting it (just as the parabola y = x^2 does).

The cubical parabola y = x^3 possesses quite different properties. It has
neither maxima nor minima: an increase in x always brings about an increase
in y, that is, as a point moves from left to right on the curve, it always
rises. It is said that the function (1.5.2) increases for all values of x
(what we mean by this is that the function increases with the independent
variable and that this behavior does not change).

Figure 1.5.1
Curve (1.5.2) has no axis of symmetry: the values of y in this case are
negative for negative x and positive for positive x. Instead it has a
center of symmetry, the origin O (0, 0). Indeed, if we take the curve
(1.5.3), we can see that to every two values of the independent variable
that differ in sign but not in absolute value, x_0 and −x_0, there
corresponds only one value y_0 = x_0^4 of the function (or two coincident
values), that is, to each point A(x_0, y_0) on curve (1.5.3) there
corresponds a point A_1(−x_0, y_0) symmetric about the y axis
(Figure 1.5.1b). Now if we take the function (1.5.2), we see that to every
two values of the independent variable that differ in sign but not in
absolute value, x_0 and −x_0, there correspond two values of the function
that differ in sign but not in absolute value, namely, y_0 = x_0^3 and
−y_0 = −x_0^3, so that to each point B(x_0, y_0) on the parabola (1.5.2)
there corresponds a point B_1(−x_0, −y_0) symmetric with respect to the
origin O, which bisects the line segment BB_1 (Figure 1.5.1a).

Figure 1.5.2 depicts a curve defined by the formula

y = x^3 + x. (1.5.4)

This curve also has the distinguishing feature that on any portion of it an
increase in x brings about an increase in y and the curve constantly rises
from left to right, just as the function y = x^3 does. The curve (1.5.4)
has neither maxima nor minima. Quite obviously, such a curve cuts the axis
of abscissas only once, at point O.

Figure 1.5.3 shows a curve constructed on the basis of the formula

y = x^3 − x. (1.5.5)

As is evident from the graph, this curve has two portions where y increases
with x: for negative x < −0.58 and for positive x > 0.58. Between them, on
the interval −0.58 < x < 0.58, the function is decreasing: y decreases as x
grows.

Figure 1.5.3

Figure 1.5.4

The function (1.5.5) has a maximum at (x ≈ −0.58, y ≈ 0.38). [1.8] In
this context, the word “maximum” does not mean that in the given case
y ≈ 0.38 is the greatest possible value of y given by the expression
(1.5.5), since it is clear that for large positive values of x the quantity
y will assume arbitrarily large values. So what is so conspicuous about the
maximum point (x ≈ −0.58, y ≈ 0.38) of the function (1.5.5)?

As the graph of this function shows, 
at this point y is greater than at ad- 
jacent points. The point of maximum 
separates the portion of the curve where 
the function is growing (to the left of 
the maximum) from the portion where 
the function is decreasing (to the right 
of the maximum). This is what is 
called the local (or relative) maximum: 
the value of y at this point is greater 
than the values of y at other points, but 
only for values of x that are not too 
far from x_max (in our case x_max ≈ −0.58). Similarly, at point
(x ≈ +0.58, y ≈ −0.38) the function has a local (relative) minimum.
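The notion of a local maximum lends itself to a direct numerical illustration. The sketch below is not the method of Section 2.6; it is only a crude grid scan (the step 0.001 and the interval from −1.5 to 1.5 are arbitrary choices) that looks for grid points of y = x^3 − x whose value exceeds that of both neighbors.

```python
import math


def f(x):
    """The cubic (1.5.5): y = x^3 - x."""
    return x ** 3 - x


# Grid of abscissas from -1.5 to 1.5 with step 0.001.
xs = [i / 1000 for i in range(-1500, 1501)]

# A grid point is a candidate local maximum if the function there is
# greater than at the adjacent grid points on both sides.
local_max = [
    x
    for x_prev, x, x_next in zip(xs, xs[1:], xs[2:])
    if f(x) > f(x_prev) and f(x) > f(x_next)
]

for x in local_max:
    print("local maximum near x =", x, ", y =", round(f(x), 4))

# For comparison: the value -1/sqrt(3) quoted for x_max in the text.
print("-1/sqrt(3) =", -1 / math.sqrt(3))
```

The scan finds a single candidate close to x = −0.58 even though f(1.5) is far larger than the value there, which is exactly the distinction between a local maximum and the greatest value of the function.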

In Figure 1.5.4 we have two more examples of curves describing polynomials
of order three. The cubic equation that we get by equating the respective
polynomial to zero has one (real) solution (or root), x ≈ 0.48, in the case
of the upper curve, and three roots, x_1 = 1, x_2 = 2, and x_3 = 3, in the
case of the lower curve. It is easy to see that a cubic equation always has
at least one real root: to be sure of this, the reader is advised to examine
the behavior of the curve y = ax^3 + bx^2 + cx + d for very large (in
absolute value) positive and negative values of x.

[1.8] In Section 2.6 we will learn how to find the maxima and minima of a
function. For example, in the case of formula (1.5.5) we will find that
x_max = −1/√3 ≈ −0.57735, y_max = 2/(3√3) ≈ 0.3849.

The graph of the cubical parabola (1.5.2) can be used to solve
(approximately) an arbitrary equation of degree three. Let us write the
general equation of degree three (the cubic equation) as

x^3 + 3ax^2 + bx + c = 0 (1.5.6)

(we will find it convenient to denote the coefficient of x^2 by 3a rather
than by a). We transform Eq. (1.5.6) thus:

(x^3 + 3ax^2 + 3a^2 x + a^3) + (b − 3a^2) x + (c − a^3) = 0,

or

(x + a)^3 − (3a^2 − b)(x + a) − [(a^3 − c) − a(3a^2 − b)] = 0,

or, finally,

X^3 − KX − B = 0, (1.5.6a)

where X = x + a, K = 3a^2 − b, and B = ab − c − 2a^3.

Clearly, solving Eq. (1.5.6) is the same as solving Eq. (1.5.6a), which can
also be rewritten as

X^3 = KX + B. (1.5.7)

Hence, if we can sketch the cubical parabola y = X^3 with great accuracy
(say, using graph paper), we need only construct on the same graph the
straight line y_1 = KX + B and find the points (or point) of intersection
of the straight line with the cubical parabola; the result is the solution,
or the approximate values of the roots of Eq. (1.5.6a) (and hence of
Eq. (1.5.6)).

For example, in the case of the equation x^3 − 6x^2 + 11x − 4 = 0 (see
Figure 1.5.4) we have a = −2, b = 11, c = −4, whence
K = 3a^2 − b = 3 × 4 − 11 = 1 and
B = ab − c − 2a^3 = −2 × 11 + 4 − 2 × (−8) = −2. Thus, to find the
solutions to our equation we need only find the points of intersection of
the parabola y = X^3 and the straight line y_1 = X − 2. Figure 1.5.5 yields
the approximate solution: X ≈ −1.5, whence x = X − a ≈ −1.5 + 2 ≈ 0.5.

Figure 1.5.5

Similarly, in the case of the equation x^3 − 6x^2 + 11x − 6 = 0 (see
Figure 1.5.4) we have a = −2, b = 11, c = −6, K = 3a^2 − b = 1, and
B = ab − c − 2a^3 = 0. Here, obviously, the points of intersection of the
cubical parabola y = X^3 with the straight line y_2 = X correspond to values
of X that are −1, 0, and +1, which implies that the roots
x = X − a = X + 2 of the initial equation are equal to 1, 2, and 3 (see
Figure 1.5.5).
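The reduction to the form X^3 = KX + B is easy to reproduce numerically. In the sketch below, bisection stands in for reading the intersection off graph paper; this swap, the function names, and the bracketing interval [−2, −1] are assumptions made for illustration and are not part of the text.

```python
def reduce_cubic(a, b, c):
    """For x^3 + 3a x^2 + b x + c = 0 return (K, B) of X^3 = K X + B, X = x + a."""
    K = 3 * a * a - b
    B = a * b - c - 2 * a ** 3
    return K, B


def bisect(func, lo, hi, tol=1e-12):
    """Find a zero of func on [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


# First example of the text: x^3 - 6x^2 + 11x - 4 = 0, so 3a = -6.
a, b, c = -2, 11, -4
K, B = reduce_cubic(a, b, c)
print("K =", K, ", B =", B)

# Intersection of the cubical parabola y = X^3 with the line y = K X + B.
X = bisect(lambda t: t ** 3 - (K * t + B), -2.0, -1.0)
x = X - a
print("X =", round(X, 4), ", x =", round(x, 4))
```

The computed root lies close to the value x ≈ 0.48 quoted for the upper curve of Figure 1.5.4, slightly sharper than the graphical estimate 0.5.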

The curve that represents the func- 
tion 

y= (1-5.8) 

where n is a positive integer, is some- 
times called a hyperbola of order n + 1 
or a hyperbola of degree n. For example, 
Figure 1.5.6 depicts a hyperbola of degree two

y = 1/x^2 (1.5.9)

and a hyperbola of degree three

y = 1/x^3. (1.5.10)

Figure 1.5.6

As one can easily understand, curve (1.5.10) resembles an ordinary
hyperbola (1.4.2a): both consist of two branches (symmetric about the
origin O), and at x = 0 the value of y is +∞ or −∞ (compare the results of
Section 1.4 concerning the hyperbola (1.4.2a) with this result). On the
other hand, the hyperbola (1.5.9) differs from an ordinary hyperbola, since
here y is positive for x both positive and negative. This curve also
consists of two branches, which are symmetric about the straight line x = 0
(the y axis) rather than about O. For the function y = 1/x^2 we can
symbolically write y = +∞ at x = 0 (this means that for small values of x
the quantity y may become as large as desired).

The graph of the function

y = ±x^(3/2) = ±√(x^3), (1.5.11)

or, more precisely, the curve with the equation

y^2 = x^3 (1.5.11a)

is known as a semicubical parabola (Figure 1.5.7). Since no values of y
correspond to negative values of x (for negative x's the right-hand side of
(1.5.11a) is negative, which is impossible since y^2 is never negative), the
entire curve lies in the right half-plane. Since to each value of x there
correspond two values of y that differ in sign but not in absolute value,
precisely, +√(x^3) and −√(x^3), the curve is symmetric with respect to the
x axis: to each point on the curve, A(x_0, y_0), there corresponds a point
symmetric about the axis of abscissas, A_1(x_0, −y_0); below, in
Section 2.5, we prove with sufficient rigor that the semicubical parabola
at the origin O touches the x axis.

These examples illustrate the behavior of 
all power functions , that is, functions of the 
type 

y = x n , (1.5.12) 

where the exponent n may be positive or non- 
positive, greater or smaller than unity in 
absolute value, an integral or a fractional 
number. 

If n > 1, the curves representing (1.5.12) behave like the quadratic
parabola (1.4.3a) (we will consider only nonnegative values of x since the
very definition of the quantity x^n with x negative and n a noninteger
presents certain difficulties; for instance, x^(1/2) = √x does not exist
for x negative). All curves of this type touch the x axis at the origin O
and pass through point Q (1, 1), and the higher
the value of n, the closer the corresponding 
curve (1.5.12) is to the x axis in the vicinity of 
the origin and the steeper is the curve in the 
vicinity of point Q. In the interval 0 < n < 1, 
on the other hand, curves (1.5.12) touch the y 
axis, and the smaller the value of n , the clos- 
er they are to the y axis; these curves also 
pass through point Q (1, 1), and the smaller 
the value of n , the sharper is their turn away 
from the origin to this point (see Figure 1.5.8a, 
which shows the different curves, which cor- 
respond to different positive values of n in 
(1.5.12), and the straight line y = x, which 
corresponds to n = 1 and separates the 
curves (1.5.12) with n < 1 from those with 
n > 1). 

The curves (1.5.12) with n negative behave quite differently. These curves,
specified by the equation

y = x^(−m) = 1/x^m, m > 0, (1.5.12a)

also pass through point Q (1, 1), but, in contrast to the case where n is
positive, they do not enter the square OAQB with the diagonal OQ and lie
completely outside it. The x and y axes are the asymptotes to these curves,
and the greater the absolute value of n, that is, the greater the positive
number m = −n, the faster these curves tend to merge with the x axis (a
complete merge never happens, however) and the slower they approach the
y axis (see Figure 1.5.8b, where we have depicted curves (1.5.12) with the
following values of n: −1/10, −1/3, −1/2, −1, −2, −3, −10, that is,
m = 1/10, 1/3, 1/2, 1, 2, 3, 10 in Eq. (1.5.12a) and the “limit line”
y = x^0 = 1 corresponding to n = 0).

Figure 1.5.8

Note also that through each point M (x, y) of the first quadrant (i.e.
points with x and y positive) for which x ≠ 1 there passes only one curve
(1.5.12); if x − 1 and y − 1 have the same sign, that is, if x > 1 and
y > 1 or x < 1 and y < 1, the value of n for this curve is positive, while
if x − 1 and y − 1 have different signs, that is, if one of the numbers
x, y is greater than unity and the other is smaller than unity, then n < 0
for this curve. As for the points N (1, y) with y > 0, not a single curve
of (1.5.12) passes through such a point N that differs from Q, while all
curves specified by (1.5.12) pass through point Q.
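The unique exponent of the power curve through a given point can be written out explicitly: from y = x^n with x > 0, y > 0, and x ≠ 1 it follows that n = ln y / ln x. This formula is not stated in the text; the sketch below (with the illustrative function name `exponent_through`) uses it to check the sign rule just described.

```python
import math


def exponent_through(x, y):
    """Exponent n of the curve y = x^n passing through (x, y), x, y > 0, x != 1."""
    if x <= 0 or y <= 0:
        raise ValueError("the point must lie in the first quadrant")
    if x == 1:
        raise ValueError("only Q(1, 1) lies on the curves, and on all of them")
    return math.log(y) / math.log(x)


# x - 1 and y - 1 of the same sign: n is positive.
print(exponent_through(2.0, 8.0))    # close to 3
print(exponent_through(0.5, 0.25))   # close to 2

# x - 1 and y - 1 of different signs: n is negative.
print(exponent_through(2.0, 0.5))    # close to -1
print(exponent_through(0.25, 4.0))   # close to -1
```

Points with y = 1 and x ≠ 1 give n = 0, that is, the “limit line” y = 1.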

As for the behavior of the curves specified by (1.5.12) at great absolute
values of n, see Exercise 1.5.3. (Mathematicians love to speak of
asymptotic behavior, or simply asymptotics, of the functions (1.5.12) when
n → ∞ or n → −∞.) [1.9]


Exercises 

1.5.1. Construct the curves given by the equations (a) y = x^4 + 1,
(b) y = −x^4 + 1, and (c) y = x^4 + x^2.

1.5.2. Construct the curves given by the equations (a) y = −x^(−3) + 1
and (b) y = −x^(−2) + 4.

1.5.3. What shape have the curves (a) y = x^(2n), (b) y = x^(2n+1),
(c) y = x^(−2n), and (d) y = (x^(2n+1))^(−1) for very large values of the
positive integer n?


1.6 The Inverse of a Function. 

Graphs of Inverse Functions 

By fixing a quantity y as a function of another quantity x we mean that to
each x there is assigned a definite value of y. But this dependence can be
inverted, namely, we can fix y and then find x. For example, the law of
motion of a train traveling from one station to another can be fixed by
specifying the position z of the train at every moment t, where z can be,
say, the distance traveled from the initial station. This fact, of course,
can be stated mathematically as z = f(t), and the engine-driver uses the
function in his timetable. But a passenger is more interested in the
“inverse” timetable, the dependence of the time t_1 at which the train will
be at a specified station on the coordinate z_1 of that station. The
dependence t = g(z) is called the inverse of z = f(t) or, in other words,
g is a function that is inverse to f. Note that, of course, f is the
inverse of g.

Here are some examples, where the left column lists the function y = y(x)
and the right column the function x = x(y) (in the last instance the
independent argument is denoted by y and the function by x, contrary to
tradition):

y = x + a,      x = y − a;
y = 3x + 2,     x = (1/3)y − 2/3;
y = 1 − x,      x = 1 − y;
y = x^2,        x = ±√y;
y = x^3 + 1,    x = ∛(y − 1).        (1.6.1)

[1.9] The word “asymptotics” is of the same origin as “asymptote” since the
statement that the function y = 1/x has two asymptotes y = 0 and x = 0
characterizes the behavior of this function for small and large absolute
values of x.


It is easy to grasp how the graphs of the functions f and g are connected.
Suppose that we have the graph of the function y = f(x) (Figure 1.6.1). For
this graph to represent the function x = g(y), we must view it at a
different “angle”, precisely, the y axis must be considered the axis of
abscissas (the independent variable) and the x axis the axis of ordinates.
If we wish the axis of the independent variable to remain horizontal and the
axis representing the values of the function vertical, we must rotate the
graph together with the coordinate axes through 180° about the bisector of
the first and third quadrant angles (the straight line l). After such a
rotation the y axis becomes horizontal and the x axis vertical. The result
of such a rotation is depicted in Figure 1.6.1 by a dashed curve, and the
continuations of the x and y axes are also depicted by dashed segments to
stress the fact that the axes have changed places.

In Figure 1.6.2 the two parts of Figure 1.6.1 (the solid curve and the
dashed) have been separated; in Figure 1.6.2b the abscissas are denoted by
x and the ordinates by y (as is customary). We see that while in
Figure 1.6.2a the value a = OA of the independent variable corresponds to
b = OB of the function (where b = f(a)), in Figure 1.6.2b, depicting the
graph of g, we have a = g(b). A rotation through 180° about the bisector l
maps the segments OA = a and OB = b of Figure 1.6.2a into the segments
OA_1 = a and OB_1 = b of Figure 1.6.2b. Thus the graphs of two functions
each of which is the inverse of the other are symmetric about the bisector l
of the first and third quadrant angles. For example, Figure 1.6.3 depicts
graphs for a pair of such functions: y = 3x + 2 and y = x/3 − 2/3. In
particular, if the graph of a function is symmetric about the bisector l,
the function coincides with its inverse. Such are, say, the functions
y = 1 − x and y = 1/x (see Figure 1.4.2). Indeed, y = 1 − x implies that
x = 1 − y, while y = 1/x implies that x = 1/y.
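The symmetry about the bisector of the first and third quadrant angles amounts to swapping the two coordinates of every point of the graph. The sketch below illustrates this for the pair y = 3x + 2 and y = x/3 − 2/3 and for the two self-inverse functions mentioned above; the sample abscissas and the helper names are arbitrary illustrative choices.

```python
def f(x):
    return 3 * x + 2


def g(x):
    """The inverse of f, written with the customary argument name x."""
    return x / 3 - 2 / 3


def mirror(points):
    """Reflect points about the line y = x by swapping coordinates."""
    return [(y, x) for x, y in points]


graph_f = [(x, f(x)) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

# Every mirrored point of the graph of f lies on the graph of g.
for x, y in mirror(graph_f):
    assert abs(g(x) - y) < 1e-12

# A function whose graph is symmetric about y = x is its own inverse.
for h in (lambda x: 1 - x, lambda x: 1 / x):
    for x in (0.5, 2.0, 4.0):
        assert abs(h(h(x)) - x) < 1e-12

print(mirror(graph_f))
```

Applying `mirror` twice returns the original points, in line with the remark that f is in turn the inverse of g.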

Let us now turn to Figure 1.6.4, which depicts the graphs of the functions
y = x^2 (the solid curve) and y = ±√x (the dashed curve). The reader can
see that the function y = ±√x, which is the inverse of the function
y = x^2, is two-valued: to each positive value of the independent variable
x there correspond two values of the function, y = √x and y = −√x. On the
other hand, there are no values of y corresponding to negative values of x,
since the graph of the function y = ±√x lies entirely in the right
half-plane corresponding to x positive. Therefore, the statement that to
each value of x there corresponds a value of the inverse function y will be
incorrect in the given case, since there are values of x (x < 0) for which
no values of y exist, while for other values of x (x > 0) we have even an
excess of values of y (two values of y for each value of x > 0).

Figure 1.6.4

Figure 1.6.3

Figure 1.6.5

The situation with the curve of a function
$f(x)$ depicted in Figure 1.6.5 is
even more complicated (the dashed curve
corresponds to the graph of the inverse
function g). Here to a value of,
say, $x = a$ there correspond four values
of $g(x)$, namely, $b$, $b_1$, $b_2$, and
$b_3$. This, however, is not surprising:
there is no rule by which a function
that is the inverse of a given function
$f(x)$ must be single-valued. If we take
the example of the train discussed above,
the points (stations) through which the
train passes (the position of each point
is defined by the distance z from the
initial point O) are determined uniquely
by the value of t (time), and it
is this dependence $z = z(t)$ that is
given in the timetable for the engine
driver. But if a given suburban train
does several runs a day, then it passes
a certain point on its trip back and
forth several times in the course of the
day, which implies that the inverse
dependence of time on distance, $t = t(z)$,
is multiple-valued: to one value
of z (the distance of a point from
the initial point) there correspond several
moments in time when the train
is exactly at a point distant z from O.

So where does the difference lie between
a function like $y = 3x + 2$ (see Figure 1.6.3;
here the inverse function is
single-valued) and a function like $y = x^2$
(see Figure 1.6.4; the inverse function
is two-valued) and, all the more,
a function like the one depicted in
Figure 1.6.5 (the inverse function is
multiple-valued)? The answer is obvious.
The inversion, so to say, of a
function, that is, finding the inverse
of a given function, consists in reconstructing
the values of abscissas from
the respective values of the ordinates,
say, reconstructing the time t from
the distance z traveled by a train. We
are lucky if to each value of the func- 
tion y there corresponds exactly one 
value of the independent variable x , 
that is, if the inverse function is single-valued,
but such good “behavior” rarely
happens. However, an inverse function
is always single-valued if the initial
(single-valued) function $y = f(x)$
is monotonic, that is, it either always
increases or always decreases. Here, as
the independent variable x changes, 
the function runs through new values 
of y and to each such value of y there 


corresponds only one value of x . For 
example, there is no difficulty in deter- 
mining the inverse of y = 3x + 2, since 
this (linear) function increases mono- 
tonically. The same is true for the 
dependence of the distance traveled by 
a train on the time, $z = f(t)$, if the
train moves without stops and in one 
direction. But if the direction in which
the function changes is reversed (say,
the train starts moving in the opposite
direction), for example, the function
first decreases, as $y = x^2$ does, and then,
after passing through the minimum
point, begins to increase, each value
of the function is passed a second time,
and therefore we cannot guarantee that
a single definite value of x corresponds
to a given value of y, since there
can be several values of x corresponding
to a single value of y (two values
in the case of the function $y = x^2$).
Finally, if we take the function $y = a$
(see Figure 1.3.6), there is no way in
which we can define the inverse, since
y does not depend on x and, hence, x
cannot be reconstructed from y; in
other words, the function $y = a$ has
no inverse.

In the case of other functions the situation
is not as bad as one would think.
If we wish to construct a single-valued
function that is the inverse of, say,
$y = x^2$, we need only confine ourselves
to one of the monotonic parts of this
function, say consider only positive
values of the independent variable x
(in Figure 1.6.4 this corresponds to the
part of the curve $y = x^2$ lying in the
first quadrant). This monotonic section
of the function has a single-valued
inverse (the graph of the single-valued
inverse function $y = \sqrt{x}$ corresponding
to this section is depicted by the
dashed curve lying in the first quadrant
of Figure 1.6.4; it is this function
that is known as the arithmetic
value of the square root of x and is denoted
by $\sqrt{x}$). [1.10] Similarly, if only

1.10 One must bear in mind, however,
that the quantity $+\sqrt{x}$, where by $\sqrt{x}$ we
mean the arithmetic value of the root, is not

a point $M(x_0, y_0)$ on the graph is not a
maximum or a minimum of a function
$y = f(x)$, like point M in Figure 1.6.5,
then in the neighbourhood of
point M a monotonic interval can be
specified. [1.11] To this interval there corresponds
a single-valued branch of the
inverse function g (see the parts of the
solid curve and dashed curve from point
O to point B in Figure 1.6.5). In general,
a multiple-valued function, like
the function $y = g(x)$ depicted in Figure 1.6.5
by a dashed curve, usually
splits into separate single-valued
branches (in our case these are the
branches depicted in Figure 1.6.5 by arcs
OB, BC, CD, and DE). The difficulty here
may arise only when we wish to choose
the “principal” branch, but from the
viewpoint of mathematics this question
is irrelevant.
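The role of a monotonic interval can be illustrated by a small numerical sketch: once we confine ourselves to a branch on which the function is monotonic, the value of x can be reconstructed from y by repeatedly halving the interval. The routine below is our own illustration (the name `invert_on_branch`, the bisection method, and the tolerance are assumptions, not something prescribed by the text).

```python
def invert_on_branch(f, y, lo, hi, tol=1e-12):
    """Solve f(x) = y for x in [lo, hi], assuming f is monotonic on [lo, hi]."""
    f_lo, f_hi = f(lo), f(hi)
    increasing = f_lo < f_hi
    if not (min(f_lo, f_hi) <= y <= max(f_lo, f_hi)):
        raise ValueError("y is not attained on this branch")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half of the interval that still contains the solution.
        if (f(mid) < y) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def square(x):
    return x * x

right_branch = invert_on_branch(square, 4.0, 0.0, 3.0)   # branch with x >= 0
left_branch = invert_on_branch(square, 4.0, -3.0, 0.0)   # branch with x <= 0
```

The same value y = 4 yields two different answers, one on each branch, which is exactly the splitting of a multiple-valued inverse into single-valued branches.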


Exercises 

1.6.1. What are the inverses of (a) $y = 2x + 4$,
(b) $y = x^2 - 2$, (c) $y = x^3 + 3x^2 + 3x$,
and (d) $y = x^4 + 2x^2$?

1.6.2. Employ Figures 1.5.2, 1.5.3, 1.5.6a,
and 1.5.6b to construct the graphs of the
functions that are the inverses of (a) $y = x^3 + x$,
(b) $y = x^3 - x$, (c) $y = x^{-2} = 1/x^2$,
and (d) $y = x^{-3} = 1/x^3$. Separate, where possible,
the intervals on which the functions are
monotonic, which permits arriving at single-valued
inverse functions.

1.6.3. Find the functions that are the
inverses of (a) an arbitrary linear function
$y = ax + b$, (b) an arbitrary quadratic function
$y = ax^2 + bx + c$, and (c) the function
$y = (ax + b)/(cx + d)$.

1.6.4. Suppose that $f(x)$ and $g(x)$ are two
functions each of which is the inverse of the
other, with f defined on the interval
$a \le x \le b$ and g on the interval $\alpha \le x \le \beta$, where
$f(a) = \alpha$ and $f(b) = \beta$ and the function
$y = f(x)$ increases monotonically on the interval
$a \le x \le b$. Prove that (a) $f(g(x)) = x$,
$g(f(x)) = x$, and (b) the functions $F(x) = f(f(x))$
and $G(x) = g(g(x))$ constitute another
pair of such functions. What are the domains
of these two functions?


the complete inverse of $y = x^2$; only the two-valued
function $y = \pm\sqrt{x}$ is the exact
inverse of $y = x^2$.

1.11 In this connection one speaks of the
function $y = f(x)$ as being locally monotonic
at point M, that is, monotonic in
a certain neighbourhood of point M.


1.7 Transforming Graphs of Functions 

Above we repeatedly encountered the 
problem of transforming graphs, that 
is, the transformation of one graph 
into another graph that is similar to 
the initial graph. We will now return 
to this problem having in mind a more 
systematic approach. 

The simplest case is the translation 
of a curve. Suppose that we have the 
graph of a function y = f (x) and we 
would like to know the shape of the 
curve obtained through shifting this 
graph b units upward. The answer is 
almost obvious: it is clear that if we 
consider two curves, $y = f(x)$ and
$y = f(x) + b$, then to each point $A(x_0, y_0)$
of the first graph there corresponds
a point $A_1(x_0, y_1)$ of the second graph,
which lies b units above point A (see
Figure 1.7.1, with b positive). Putting
it differently, we can say that if we
have two functions, $y = f(x)$ and
$y - b = f(x)$, then the graph of the
second is b units higher than the graph
of the first function, that is, replacing y
with $y - b$ in the equation of a curve
is equivalent to shifting the curve b 
units upward. 

It could seem that there is no sense 
in formulating the same statement in 
two different ways that differ very 
little from each other, namely, lifting
a curve means performing the following
transformation of the equation
of the curve: go over from $y = f(x)$ to
$y = f(x) + b$ or from $y = f(x)$ to
$y - b = f(x)$. However, the second formulation,
which involves replacing y
with $y - b$, is more convenient than
the first when the curve is specified
implicitly, that is, by an equation of
the type $F(x, y) = 0$, which is not
resolved for y. For instance, the equation
of a circle S of radius 1 with center
at the origin is conveniently written
as

$$x^2 + y^2 = 1,$$

or

$$F(x, y) = x^2 + y^2 - 1 = 0 \tag{1.7.1}$$

(compare with formula (1.2.1) for the
distance OA, with $A = A(x, y)$). Replacing
y with $y - b$, we get

$$x^2 + (y - b)^2 - 1 = 0,$$

which is the equation of a unit circle centered
at the point $Q_1(0, b)$, that is, the circle
obtained from S by shifting S upward
by b units (Figure 1.7.2).

Similarly (as was discussed in great
detail in Section 1.4), replacing in the
equation $F(x, y) = 0$ the independent
variable x with $x - a$ is equivalent to
shifting the curve a units to the
right. For instance, the equation
$(x - a)^2 + y^2 - 1 = 0$ defines a circle of
radius 1 centered at point $Q_2(a, 0)$,
that is, the circle obtained from S
by shifting S rightward a units (see
Figure 1.7.2).

It is clear that the quantities a and
b may be negative; for instance, transforming
the equation $y = f(x)$ into the
equation $y = f(x) - b$ (where, as usual,
b is positive) or replacing y in the
equation of the curve with $y + b$ means
that we shift the curve downward b
units (the dashed curve in Figure 1.7.1).
Similarly, the curve specified by the
equation $F(x + a, y) = 0$ is obtained
from the curve $F(x, y) = 0$ by shifting
the latter a units to the left.
For instance, the centers of the circles
$x^2 + (y + b)^2 - 1 = 0$ and
$(x + a)^2 + y^2 - 1 = 0$ are the points
$Q_3(0, -b)$ and $Q_4(-a, 0)$ (see Figure 1.7.2).
These translations may all
be combined; for instance, the curve
$F(x + 1, y - 2) = 0$ is obtained from
the curve $F(x, y) = 0$ by shifting the
latter one unit to the left and two
units upward (Figure 1.7.3).

Figure 1.7.3
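The rule for combined translations lends itself to a direct check on the unit circle. In the sketch below the helper `shifted` is a hypothetical name of our own; it builds the function $F(x - a, y - b)$ from F, so that a = -1, b = 2 reproduces the example $F(x + 1, y - 2) = 0$.

```python
def circle(x, y):
    """F(x, y) for the unit circle S: x^2 + y^2 - 1."""
    return x * x + y * y - 1

def shifted(F, a, b):
    """Return G(x, y) = F(x - a, y - b): the curve F = 0 moved a units right, b units up."""
    return lambda x, y: F(x - a, y - b)

moved = shifted(circle, -1, 2)   # F(x + 1, y - 2): one unit left, two units up

# Four points of S, and the same points after the shift.
points_on_S = [(1, 0), (0, 1), (-1, 0), (0, -1)]
moved_points = [(x - 1, y + 2) for x, y in points_on_S]
on_moved_curve = [moved(x, y) == 0 for x, y in moved_points]
```

Every shifted point satisfies the new equation, while the old points in general do not.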

Example. Let us consider the linear-fractional
function

$$y = \frac{ax + b}{cx + d}, \quad c \neq 0. \tag{1.7.2}$$

Many experimentally established relationships
obey this function, either exactly or
approximately, so it is important to know
how to simplify the function and construct
its graph.

It is clear that if the numerator and
denominator of the fraction on the
right-hand side of (1.7.2) are such that
$a : c = b : d$, or $ad = bc$, the function
specified by (1.7.2) is a constant,
$y = k$, where $k = a/c = b/d$; hence
the only interesting case is where $ad \neq bc$.
We state that in this case the
graph of the function (1.7.2) is a hyperbola
and, hence, the function (1.7.2)
represents inverse proportionality of
two quantities.

We leave it to the reader to examine
the general case (see Exercise 1.7.6) and
consider the particular case

$$y = \frac{19 - 6x}{2x - 5}. \tag{1.7.2a}$$

In the right-hand side we isolate the
integral part:

$$y = \frac{(15 - 6x) + 4}{2x - 5} = \frac{15 - 6x}{2x - 5} + \frac{4}{2x - 5} = -3 + \frac{4}{2x - 5}.$$

The last equation can be rewritten
thus:

$$y + 3 = \frac{2}{x - 5/2}, \tag{1.7.2b}$$

from which we see that the curve (1.7.2a),
or (1.7.2b), is the graph of inverse proportionality
(with coefficient 2) between
$y + 3$ and $x - 5/2$ (Figure 1.7.4).
This completes the solution of the
problem.

Figure 1.7.4
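The transformation of (1.7.2a) into (1.7.2b) can be verified in exact rational arithmetic. The sketch below uses the standard `fractions` module; the names `y_original` and `y_shifted` and the grid of sample points are our own choices.

```python
from fractions import Fraction

def y_original(x):
    """Right-hand side of (1.7.2a)."""
    return (19 - 6 * x) / (2 * x - 5)

def y_shifted(x):
    """The same function written as in (1.7.2b): y = -3 + 2 / (x - 5/2)."""
    return -3 + 2 / (x - Fraction(5, 2))

# Rational sample points from -5 to 5, skipping the pole x = 5/2.
xs = [Fraction(n, 4) for n in range(-20, 21) if Fraction(n, 4) != Fraction(5, 2)]
identity_holds = all(y_original(x) == y_shifted(x) for x in xs)
```

Because `Fraction` arithmetic is exact, the comparison uses `==` rather than a tolerance.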

Now let us see how the equation of
a curve must be changed so that all
vertical dimensions (along the y axis)
are increased c-fold. [1.12] Obviously, in
place of the equation $y = f(x)$ we must
take a new equation $y = cf(x)$ (we will
write $y_0 = f(x)$ and $y_1 = cf(x)$, since
these are two different curves). Then for
the same x the quantity $y_1$ will be c
times greater than before, that is to
say, c times $y_0$, and the curve will be
stretched in the vertical direction c-fold.

As an example, recall the equations
of straight lines passing through the
origin. The equation of the bisector of
the first and third quadrant angles is
$y_0 = x$. The equation $y_1 = 10x$ corresponds
to a straight line that is more
steeply slanted: for a given x the ordinate
is 10 times greater (see Figure 1.3.4).

1.12 For the sake of simplicity, we from
now on assume that c is greater than unity;
the case with c less than unity (and even
negative) we discuss below.

The law by which $y_0 = f(x)$ is transformed
into $y_1 = cf(x)$ may also be
described thus: in the equation of the
curve $y_0 = f(x)$ replace $y_0$ by $y_1/c$,
that is, write $y_1/c = f(x)$. Then the
dependence of $y_1$ on x (the new y) is characterized
by the fact that the curve
$y_1(x)$ is elongated c-fold vertically as
compared to the curve $y_0(x)$ (the old y).

Again the need for two formulations
whose equivalence is quite obvious (indeed,
it is clear that the equations $y = cf(x)$
and $y/c = f(x)$ are the same
equation) follows from the fact that
it is more convenient to replace y with
$y/c$ if we are forced to “stretch” c-fold
away from the x axis a curve whose equation
$F(x, y) = 0$ is not resolved for y.
Just substitute $y/c$ for y in the equation,
and we have the sought result:
$F(x, y/c) = 0$.

Let us again turn to the unit circle S
whose equation is $x^2 + y^2 - 1 = 0$,
that is, a unit circle centered at the
origin. How would you write the equation
of the curve obtained by stretching
the circle along the vertical axis 3-fold
(Figure 1.7.5; the new curve is denoted
by $y_1(x)$)? By the rule that we have
just stated, in the equation of the circle
we must replace y with $y/3$. This yields

$$x^2 + \left(\frac{y_1}{3}\right)^2 - 1 = 0 \tag{1.7.3}$$

(here we write $y_1$ instead of y to distinguish
between curve (1.7.3) and circle
(1.7.1)).

Figure 1.7.5

The curve obtained as a result of
stretching the unit circle away from
the horizontal (or vertical) diameter
(see Exercise 1.7.5) is an example
of an ellipse. [1.13] We have therefore
transformed circle (1.7.1) (in Figure 1.7.5
this circle is denoted by $y_0(x)$)
into an ellipse, (1.7.3).

In the given case Eqs. (1.7.1) and
(1.7.3) can easily be solved: $y_0 = \sqrt{1 - x^2}$
and $y_1 = 3\sqrt{1 - x^2}$, which
clearly shows that $y_1 = 3y_0$ for equal x.
But the rule, or statement, by which
replacing y with $y/c$ leads to a c-fold
stretching of the curve along the vertical
axis holds true also for curves defined
by a complicated equation
$F(x, y) = 0$, that is, an equation that cannot
be resolved algebraically for y, say

$$x + y \log y = 0. \tag{1.7.4}$$

The curve that results from stretching
curve (1.7.4) 3-fold along the vertical
axis is described by the following equation:

$$x + \frac{y}{3} \log\left(\frac{y}{3}\right) = 0. \tag{1.7.4a}$$
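The rule "replace y with y/c" can be tried out on the unit circle with c = 3. The following sketch is illustrative only; the helper `stretch_y` is our own hypothetical name, and the check is done at a handful of sample abscissas.

```python
import math

def circle(x, y):
    """F(x, y) for the unit circle (1.7.1)."""
    return x * x + y * y - 1

def stretch_y(F, c):
    """Return G(x, y) = F(x, y / c): the curve F = 0 stretched c-fold along the y axis."""
    return lambda x, y: F(x, y / c)

ellipse = stretch_y(circle, 3)          # left-hand side of (1.7.3)

# For each x, the point with ordinate 3 * sqrt(1 - x^2) should lie on the ellipse.
xs = [-1.0, -0.6, 0.0, 0.6, 1.0]
residuals = [abs(ellipse(x, 3 * math.sqrt(1 - x * x))) for x in xs]
```

The same `stretch_y` helper applies unchanged to an equation such as (1.7.4), which cannot be solved for y; this is the practical advantage of the second formulation.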

The statement concerning the replacement
of y with $y/c$ is readily extended
to the x-coordinate. When we
replace $x_0$ with $x_1/c$ in the equation of
the curve, $F(x_0, y) = 0$, that is, when
we go over to the equation $F(x_1/c, y) = 0$,
the initial curve stretches along the
x axis c-fold, which is to say, for
equal y the value of $x_1$ is c times the
value of $x_0$.

We begin with examples instead of
a proof: $y = x_0$ and $y = x_1/10 = 0.1x_1$
(see Figure 1.3.4). The first straight
line slants at an angle of 45° to the x
axis, the second straight line is less
steep.

Figure 1.7.6

1.13 The stretching coefficient, or ratio,
may be equal to 1 (of course, such an identity
transformation does not change the shape of
a curve); accordingly, a circle is
usually considered as a particular case of an
ellipse.

Another illustration: $x_0^2 + y^2 - 1 = 0$
and $(x_1/2)^2 + y^2 - 1 = 0$. The first
equation corresponds to a circle S of
radius 1 centered at the origin, while
the second corresponds to the curve obtained
from that circle via 2-fold stretching along
the x axis (Figure 1.7.6). It is clear that
the new curve is an ellipse. The proof
of this is almost obvious. If we solve
the equations of the first and second
curves,

$$y = f(x_0) \quad\text{and}\quad y = f(x_1/c), \tag{1.7.5}$$

for x, we obtain

$$x_0 = \varphi(y) \quad\text{and}\quad x_1/c = \varphi(y), \quad\text{i.e.}\quad x_1 = c\varphi(y) = cx_0, \tag{1.7.5a}$$

where $\varphi$ is the inverse of function f
(see Section 1.6). This corresponds to
the initial statement that substitution
of $x/c$ for x stretches the curve along
the x axis c-fold. The important thing
here is that f is one and the same function
in the formulas (1.7.5) involving $x_0$ and $x_1$.
Therefore $\varphi$ is also the same
in the formulas (1.7.5a) involving $x_0$ and $x_1$.

Here is another example:

$$y = 10^{x_0} \quad\text{and}\quad y = 10^{x_1/2}. \tag{1.7.6}$$

The inverse of an exponential function is a
logarithmic function, so that

$$x_0 = \log y \quad\text{and}\quad x_1/2 = \log y, \quad\text{i.e.}\quad x_1 = 2\log y. \tag{1.7.6a}$$

Thus, the graph of the function $y = 10^{x/2}$
is obtained from the graph of the function
$y = 10^x$ by a 2-fold stretching of the
latter along the x axis.

What do we do if in the equation
$y = f(x)$ the quantity kx is substituted
for x? To take advantage of the above-stated
rule, let us recall that multiplication
by k is the same as division by
$1/k$, since $kx = x/(1/k)$. Hence, $1/k$ plays
the role of c in the earlier formulas,
and if, say, $k = 1/2$, then $c = 1/k = 2$.
This means that the substitution of
$0.5x_1$ for $x_0$ is the same as replacing $x_0$
with $x_1/2$ and leads to a stretching of
the curve along the x axis by a factor
of 2. If $k = 3$, then $c = 1/k = 1/3$,
and the replacement of x with 3x is the
substitution of $x/(1/3)$ for x.

What do all these substitutions mean
geometrically? In the case where $c > 1$
the result can be stated thus: when,
in the equation of the curve, y is replaced
with $y/c$, the curve is stretched
vertically by a factor of c, while in
the substitution of $x/c$ for x the curve
is stretched horizontally by a factor
of c.

When $0 < c < 1$, nothing really
changes in our reasoning, with the exception
that the word “stretching” cannot
be used, since stretching 2-fold means
increasing the dimensions by a factor
of 2, while stretching “1/3-fold” means
multiplying the dimensions by 1/3,
which means not multiplying but dividing
by 3, or “shrinking” the curve
(in a certain direction) 3-fold. Hence,
when we substitute $y/c$ for y, the vertical
dimensions change by a factor of c,
which results in the curve “shrinking”
in the vertical direction; for instance,
at $c = 0.5$, a transition from curve
$y = f(x)$ to curve $y = cf(x) = 0.5f(x)$
means an actual reduction in height
by one half. The same goes for the substitution
$x \to x/c$: for $0 < c < 1$ this
substitution amounts to a “shrinking”
of the curve along the x axis.

Now let us establish the meaning of
the substitutions $y \to y/c$ and $x \to x/c$
when c is negative. Such a replacement
can be conducted in two stages. We
introduce $c = -b = (-1)b$, where b
is positive, and carry out two substitutions
in succession:

$$y_0 \to y_1/b \quad\text{and}\quad y_1 \to -y_2 \quad (\text{so that } y_2 = -y_1).$$

The first operation, substitution of
$y_1/b$ for $y_0$, where $b > 0$, has already
been discussed: it leads to a change
in the vertical dimensions by a factor
of b. What remains to be clarified is
the meaning of the change in sign of y,
that is, the meaning of the substitution
of $-y$ for y. Obviously, the points of
the curves $y_0 = f(x)$ and $y_1 = -f(x)$
corresponding to each other lie symmetrically
about the x axis (Figure 1.7.7).
Therefore, the curve $y_1 = -f(x)$
is obtained from the curve $y_0 = f(x)$
as the mirror reflection of the latter
with respect to the x axis, and the
same can be said of the curves
$F(x, y_0) = 0$ and $F(x, -y_1) = 0$, where
the second equation is obtained from
the first by reversing the sign of y.

Reasoning in a similar manner, we
can say that since points $(x, y)$ and
$(-x, y)$ are symmetric with respect to
the y axis, the replacement of x with
$-x$ in the equation of a curve (the
transition from curve $F(x, y_0) = 0$ to
curve $F(-x, y_2) = 0$) replaces the initial
curve with a curve symmetric to
the initial one about the y axis. Finally,
the transition from equation
$F(x, y_0) = 0$ to equation $F(-x, -y_3) = 0$
reduces to successive changes
of the sign of y (symmetry with respect
to the x axis) and of the sign of x (symmetry
with respect to the y axis). But
two successive reflections, in the x
axis and in the y axis, are equivalent to
a reflection with respect to the origin
$O(0, 0)$ (see Figure 1.7.7). Thus, simultaneous
substitution of $-x$ for x
and of $-y$ for y in the equation of a
curve is equivalent to a symmetry transformation
with respect to O.

Figure 1.7.8

Notwithstanding the simplicity of the
above reasoning, beginners (for whom
this book is intended) often make mistakes
concerning the various transformations
we have just discussed. To clarify
matters we give an example:

$$F(x, y) = (x - 3)^2 + (y - 5)^2 - 4 = 0.$$

This is the equation of a circle of radius 2
centered at a point with coordinates
$x = 3$ and $y = 5$. Figure 1.7.8
shows the initial curve and also the
following curves:

$$F(x, -y) = (x - 3)^2 + (-y - 5)^2 - 4 = 0,$$

$$F(-x, y) = (-x - 3)^2 + (y - 5)^2 - 4 = 0,$$

$$F(-x, -y) = (-x - 3)^2 + (-y - 5)^2 - 4 = 0.$$

The letter F in all four cases denotes
one and the same function. See what
happens to the curve (circle) under the
substitution of $-x$ for x or the substitution
of $-y$ for y or under the simultaneous
substitutions of $-x$ for x and
$-y$ for y. A firm grasp of these rules
will make it possible, after you have
built a curve $y = f(x)$ or $F(x, y) = 0$,
to picture the graphs of all functions
of the type

$$\frac{y - b}{d} = f\left(\frac{x - a}{c}\right) \quad\text{or}\quad F\left(\frac{x - a}{c}, \frac{y - b}{d}\right) = 0,$$

with arbitrary values of the constants a,
b, c, and d.
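To see where the circle of the example goes under each change of sign, one can locate the center of each of the four curves numerically. The sketch below is our own illustration: `find_center` is a hypothetical helper that simply searches a small integer grid for the point at which the left-hand side is smallest (for a circle this is the center).

```python
def F(x, y):
    """Left-hand side of the circle of radius 2 centered at (3, 5)."""
    return (x - 3) ** 2 + (y - 5) ** 2 - 4

curves = {
    "F(x, y)": lambda x, y: F(x, y),
    "F(x, -y)": lambda x, y: F(x, -y),
    "F(-x, y)": lambda x, y: F(-x, y),
    "F(-x, -y)": lambda x, y: F(-x, -y),
}

def find_center(G, search=range(-10, 11)):
    """Integer grid point at which G is smallest; for a circle this is its center."""
    return min(((x, y) for x in search for y in search), key=lambda p: G(*p))

centers = {name: find_center(G) for name, G in curves.items()}
```

The resulting centers are the mirror images of (3, 5) in the x axis, in the y axis, and in the origin, respectively.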

Here are two more simple (but important)
examples. In Figure 1.7.9 we
have two curves:

$$y = \sin x \quad\text{and}\quad y = \sin 3x, \tag{1.7.7}$$

where, of course, x is measured in radians.
The second curve has been compressed
horizontally by a factor of
three.

The relation $y = \sin x$ is a periodic
function: at $x = 2\pi \approx 6.3$ radians
(which corresponds to an angle of 360°)
the sine has the same value as for
$x = 0$. (Adding $2\pi$ to any angle leaves
the value of the sine unchanged.) Thus,
the graph of the first function, $y = \sin x$,
transforms into itself under a
translation along the x axis (in either
direction) by $2\pi$, which simply means
that $\sin x$ is a periodic function with
a period of $2\pi$. The function $y = \sin 3x$
is also periodic, but the period here is
less by a factor of 3. If x varies by
$2\pi/3 \approx 2.1$ rad, 3x (the angle whose
sine is laid off on the axis of ordinates)
varies by $2\pi$, and $\sin 3x$ returns to the
same value:

$$\sin 3x = \sin 3\left(x + \frac{2\pi}{3}\right).$$

Use this example to think over the
general assertion that the substitution
of kx for x in a periodic function results
in a decrease in the period by
a factor of k and an increase in the frequency
by the same factor (the frequency
is the number of periods per unit
length).
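The assertion about the period can be tested at sample points. In the sketch below the helper `has_period`, the sample grid, and the tolerance are our own assumptions; agreement at finitely many points suggests, but does not prove, periodicity.

```python
import math

def has_period(f, T, xs, tol=1e-9):
    """True if f(x + T) = f(x) at every sample point (within the tolerance)."""
    return all(abs(f(x + T) - f(x)) <= tol for x in xs)

k = 3
xs = [i * 0.1 for i in range(-50, 51)]

def sin_kx(x):
    return math.sin(k * x)

full_period_ok = has_period(math.sin, 2 * math.pi, xs)        # sin x has period 2*pi
short_period_ok = has_period(sin_kx, 2 * math.pi / k, xs)     # sin 3x has period 2*pi/3
sin_has_short_period = has_period(math.sin, 2 * math.pi / k, xs)  # sin x does not
```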

The second example deals with logarithmic
(and exponential) functions.
Take two curves:

$$y_1 = \log_a x \quad\text{and}\quad y_2 = \log_b x, \tag{1.7.8}$$

or, which is the same,

$$x = a^{y_1} \quad\text{and}\quad x = b^{y_2}, \tag{1.7.8a}$$

where it is assumed that both a and b
are greater than unity. It can be established,
by a method similar to that
employed in the analysis of (1.7.6) and
(1.7.6a) (see Section 1.4.9) that the
second curve in (1.7.8) is obtained from
the first by stretching the latter along
the y axis by a factor $c = \log_b a = (\log_a b)^{-1}$,
where c is the modulus of
base a with respect to base b. Similarly,
any two exponential functions, $y = a^x$
and $y = b^x$, are related in the same
manner: the second is obtained from
the first by stretching the first curve
along the x axis by a factor $c = \log_b a$.
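Both statements about the modulus $c = \log_b a$ can be checked numerically. The bases a = 2 and b = 10 and the sample points below are arbitrary choices of ours, used only for illustration.

```python
import math

a, b = 2.0, 10.0
c = math.log(a, b)          # c = log_b(a), the modulus of base a with respect to base b

xs = [0.5, 1.0, 3.0, 7.5, 100.0]

# log_b x = c * log_a x: a stretch of the first logarithmic curve along the y axis.
log_stretch_ok = all(
    math.isclose(math.log(x, b), c * math.log(x, a), abs_tol=1e-12) for x in xs
)

# b^x = a^(x / c): a stretch of the first exponential curve along the x axis.
exp_stretch_ok = all(math.isclose(b ** x, a ** (x / c), rel_tol=1e-9) for x in xs)

# c = log_b a = (log_a b)^(-1)
reciprocal_ok = math.isclose(c, 1 / math.log(b, a))
```

Since here c is less than unity, the "stretching" is in fact a shrinking, in agreement with the discussion of the case $0 < c < 1$.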

Two simple but highly important
concepts are related to what we have
just said. A curve given by the equation
$F(x, y) = 0$ that retains its form
under the substitution of $-x$ for x
(say, the parabola $y = ax^2$) is symmetric
about the y axis. A curve whose
equation retains its form under the
substitution of $-x$ for x and $-y$ for y
(say, the hyperbola $y = 1/x$ or the cubical
parabola $y = x^3$) is symmetric
with respect to the origin. Finally, a
curve whose equation $F(x, y) = 0$ maps
into itself under the substitution of $-y$
for y (say, the semicubical parabola
$y^2 = x^3$; see Figure 1.5.7) is symmetric
with respect to the x axis. All these
statements follow directly from the
geometrical meaning of the respective
substitutions.

A function $y = f(x)$ whose graph is
symmetric with respect to the y axis
(Figure 1.7.10a) is said to be even,
while if the graph of a function is symmetric
with respect to origin O (Figure 1.7.10b),
the function is said to be
odd. These names originate in the fact
that parabolas of even order, $y = ax^{2m}$,
are even functions, while parabolas
of odd order, $y = ax^{2m+1}$, are odd
functions (cf. Section 1.5).
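The definitions of even and odd functions translate directly into a test at sample points. The helpers `is_even` and `is_odd` below are our own hypothetical names; such a test can refute evenness or oddness, but agreement at a few points is not a proof.

```python
def is_even(f, xs, tol=1e-12):
    """f(-x) = f(x) at every sample point."""
    return all(abs(f(-x) - f(x)) <= tol for x in xs)

def is_odd(f, xs, tol=1e-12):
    """f(-x) = -f(x) at every sample point."""
    return all(abs(f(-x) + f(x)) <= tol for x in xs)

xs = [0.25, 0.5, 1.0, 2.0, 3.5]
functions = {
    "x^2": lambda x: x ** 2,
    "x^3": lambda x: x ** 3,
    "1/x": lambda x: 1 / x,
    "x^3 + 1": lambda x: x ** 3 + 1,
}
results = {name: (is_even(f, xs), is_odd(f, xs)) for name, f in functions.items()}
```

The last entry shows that a function need be neither even nor odd: shifting the cubical parabola upward destroys its symmetry about the origin.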

Finally, we note another important 
fact. It is necessary to understand (this 
is often a neglected aspect) that in ap- 
plied problems the procedure in which 
all ordinates y or abscissas x are changed 
in a given ratio (and nothing more) 
is related to a change in the units of 
measurement of y or x , so that the graphs 
obtained as a result of such a change of 
scale represent one and the same de- 
pendence of y on x to within, so to say, 
a change in scale (or the system of 
units). [1.14]

Suppose that on the axis of abscissas,
Ox, we plot time and on the axis
of ordinates, Oy, we plot distance (Figure 1.7.11);
the function $y = f(x)$
(or, as we will often write, $z = f(t)$)
fixes the law of motion. But since y
and x are measured in different units,
no comparison of the two quantities
is possible (indeed, there is no way
that we can compare 1 cm with 1 s or
say which of the two quantities is greater).
The statement, say, that for a
given point $M(x, y)$ the segment OM
lies at an angle of 45° to the axis of
abscissas (see Figure 1.7.11) is meaningless:
if we change the scale
of the y axis (go over from centimeters
to meters), point M is replaced with
a new point M' corresponding to the
same moment in time, but now $\angle POM'$
is not 45°, that is, $\angle POM' \neq 45°$, while
in all other respects M' is equivalent
to point M. Thus, in physical problems
the ellipses in Figures 1.7.5 and 1.7.6
do not differ from the circles depicted
in the same figures, and if we are interested
in a particular curve (ellipse),
then it is wise to select units of
measurement such that the ellipse
is simply a circle.

Figure 1.7.11

1.14 However, if x and y are measured in
the same units, the substitution of cy for y
has a physical meaning, since the functions
$y = f(x)$ and $y = cf(x)$ are different.

Similarly, any parabola $y = ax^2 + bx + c$
can be reduced, as we have
seen earlier, to a parabola $y_1 = ax_1^2$
by a translation, or shift (i.e. by a transition
from coordinates x and y to a set
of new coordinates, $x_1 = x + p$ and
$y_1 = y + q$, where p and q must be
chosen appropriately). But physically
such a translation means nothing else
but a change in position of the origin
from which the physical quantities x
and y are reckoned, say the initial
moment $t = 0$ and the initial position
$z = 0$, and so does not influence in any
way the physical process. After this
we can transform $y_1 = ax_1^2$ into the
“unit” parabola $Y = \pm X^2$ (or even to
$Y = X^2$ if we select the positive direction
along the y axis [1.15]) by choosing

1.15 While in the case of time, t, we have
a natural direction of growth, which corresponds
to the simple fact that the future and the


a new system of units along the $y_1$ axis
(changing the scale on this axis); it is
in this sense that, as mentioned in
footnote 1.5, the parabola $y = x^2$ can
be considered as the general case of
a parabola.

In the same way, any linear-fractional
function $y = (kx + l)/(mx + n)$ can
be reduced, by changing the position of
the origin on the x and y axes and the
scale on the y axis, to the “unit” hyperbola
$Y = 1/X$ (see Exercise 1.7.6).
Similarly, any cubic function (polynomial)
$y = ax^3 + bx^2 + cx + d$ can
be reduced to one of the three functions:
$y = x^3 + x$, $y = x^3 - x$, or
$y = x^3$ (no further reduction is possible).
The list of examples can be continued.

Let us discuss a more complicated example
involving cubic functions (polynomials):

$$y = ax^3 + bx^2 + cx + d = a(x^3 + px^2 + qx + r), \tag{1.7.9}$$

where $p = b/a$, $q = c/a$, and $r = d/a$ (of
course, $a \neq 0$). In Section 1.5 we saw that the
substitution $X = x + p/3$ transforms (1.7.9)
into

$$y = a(X^3 + PX + Q), \tag{1.7.9a}$$

or $Y = aX^3 + BX = a(X^3 + kX)$,

where $Y = y - aQ$, $B = aP$, and $k = B/a$ $(= P)$,
with $P = q - p^2/3$ and
$Q = (2/27)p^3 + r - (1/3)pq$ (see formulas (1.5.6)
and (1.5.6a)). At $k = 0$ (i.e. if $B = 0$) the
function (1.7.9a) has the form $Y = aX^3$, or

$$y_1 = x_1^3, \tag{1.7.10a}$$

where $x_1 = X$ (the X-coordinate does not
change) and $y_1 = Y/a$. If $k > 0$ (i.e. B and
a are of the same sign), we put $x_1 = X/\sqrt{k}$,
i.e. $X = \sqrt{k}\,x_1$. Then (1.7.9a) becomes

$$Y = a(k^{3/2}x_1^3 + k^{3/2}x_1) = ak^{3/2}(x_1^3 + x_1),$$

that is, we have the law

$$y_1 = x_1^3 + x_1, \tag{1.7.10b}$$

where $y_1 = Y/(ak\sqrt{k})$. Finally, if $k < 0$ (a and
B are of opposite sign), a substitution that is
similar to the one used above, $x_1 = X/\sqrt{|k|}$,
that is, $X = \sqrt{|k|}\,x_1$, transforms (1.7.9a)
into

$$Y = a(|k|^{3/2}x_1^3 + k|k|^{1/2}x_1),$$

past are not equivalent, for distance, z, this
is not the case, and we are free to choose
any direction for the increase in z.

that is,

$$y_1 = x_1^3 - x_1, \tag{1.7.10c}$$

where $y_1 = -Y/(ak\sqrt{|k|})$. Thus, for a physicist
any scientific function of the type (1.7.9)
that links two quantities of different dimensions,
y and x, is equivalent to one of the
three functions (1.7.10a)-(1.7.10c) (see Figures
1.5.1a, 1.5.2, and 1.5.3 and Exercises 1.7.8
and 1.7.9).
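The reduction of a cubic to one of the three canonical forms can be followed step by step in code. The sketch below is an illustration under our own assumptions: the names `reduce_cubic` and `max_residual` are hypothetical, the three sets of coefficients are arbitrary, and the comparison `k == 0` on floating-point numbers is adequate only for such simple inputs.

```python
import math

def reduce_cubic(a, b, c, d):
    """Return (sign of k, map (x, y) -> (x1, y1)) for y = a x^3 + b x^2 + c x + d."""
    p, q, r = b / a, c / a, d / a
    P = q - p * p / 3
    Q = 2 * p ** 3 / 27 + r - p * q / 3
    k = P
    sign = (k > 0) - (k < 0)                      # +1, 0 or -1
    s = math.sqrt(abs(k)) if sign != 0 else 1.0   # horizontal scale sqrt(|k|)

    def to_canonical(x, y):
        X = x + p / 3          # shift of the origin along the x axis
        Y = y - a * Q          # shift of the origin along the y axis
        return X / s, Y / (a * s ** 3)

    return sign, to_canonical

def max_residual(a, b, c, d, xs):
    """Largest deviation of the transformed points from y1 = x1^3 + sign * x1."""
    sign, to_canonical = reduce_cubic(a, b, c, d)
    worst = 0.0
    for x in xs:
        y = a * x ** 3 + b * x ** 2 + c * x + d
        x1, y1 = to_canonical(x, y)
        worst = max(worst, abs(y1 - (x1 ** 3 + sign * x1)))
    return sign, worst

xs = [-2.0, -1.0, 0.0, 0.5, 3.0]
results = [max_residual(*coef, xs) for coef in [(2, -6, 1, 5), (1, 0, 4, 0), (1, 3, 3, 1)]]
```

The three examples fall, respectively, under (1.7.10c), (1.7.10b), and (1.7.10a).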

Let us use the following example to
illustrate the aforesaid. It is well known
that the time dependence of the altitude
h of a stone thrown upward from
a tower of height w with an initial
velocity v is given by the formula

$$h = w + vt - \frac{g}{2}t^2, \tag{1.7.11}$$

where $g \approx 9.8$ m/s² is the acceleration
of gravity. If, however, for the zero
level we take not the ground level but
the greatest altitude reached by the
stone, $h_0$, and for time zero we take
the moment $t_0$ when this altitude is
reached, then in the new coordinates,
$h_1 = h - h_0$ and $t_1 = t - t_0$, Eq. (1.7.11)
describing the motion of the stone
simplifies considerably:

$$h_1 = -\frac{g}{2}t_1^2. \tag{1.7.11a}$$

But the coefficient $-g/2$ in Eq.
(1.7.11a) has nothing to do with the
physics of the process and is related
solely to the choice of the system of
units. Equation (1.7.11a) can be simplified
still further by dropping the minus
sign on the right-hand side; for this we
need only agree that h is measured from
point $h = h_0$ not upward but downward.
Next, we can go over to a “natural”
system of units in which the acceleration
of gravity is equal to 2 (time
can be measured in seconds, as usual,
while distance can be measured in units
of $g/2$, that is, approximately 4.9 m).
Then, in the new coordinates,
$H = (2/g)(h - h_0)$, $T = t_1 = t - t_0$, and
the law (1.7.11) or (1.7.11a) assumes
the “canonical” form $H = T^2$.
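The passage to the canonical form can be checked numerically. In the sketch below the values w = 20 m and v = 14.7 m/s are arbitrary, the moment of greatest altitude is taken as $t_0 = v/g$ (the vertex of the parabola (1.7.11)), and, since the code keeps h measured upward, the downward distance from the apex is written explicitly as $h_0 - h$; all names are our own.

```python
G = 9.8  # acceleration of gravity, m/s^2

def height(t, w, v):
    """Altitude (1.7.11) of a stone thrown upward from height w with velocity v."""
    return w + v * t - G / 2 * t * t

def to_canonical(t, w, v):
    """Map the moment t to (T, H): time from the apex, depth below it in units of g/2."""
    t0 = v / G                 # moment at which the greatest altitude is reached
    h0 = height(t0, w, v)      # greatest altitude
    T = t - t0
    H = (2 / G) * (h0 - height(t, w, v))   # measured downward from the apex
    return T, H

w, v = 20.0, 14.7
pairs = [to_canonical(t, w, v) for t in [0.0, 0.5, 1.5, 2.0, 3.5]]
worst = max(abs(H - T * T) for T, H in pairs)
```

Whatever values of w and v are chosen, the points (T, H) lie on one and the same parabola $H = T^2$.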

In a similar way we can always trans- 
form logarithmic and exponential func- 
tions often encountered in scientific laws 


into such a form that the bases acquire 
a convenient form, since the transi- 
tion from one base to another means 
actually changing only the units of 
measurement of the quantities (cf. the 
above material on logarithmic and ex- 
ponential curves). Therefore, each time 
we encounter the logarithmic function
$y = \log_c x$, we can freely assume
the base to be equal to 10 or to $e \approx 2.718$
(see Section 1.4.9) or even to
two;^{1.16} when exponential functions are
considered, $y = d^x$, the base d is usually
assumed equal to e.
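
That a change of base amounts to a change of units can be seen in a few lines. This is an illustrative sketch under the standard identities $\log_c x = \ln x/\ln c$ and $d^x = e^{x \ln d}$; the helper names are assumptions, not notation from the book.

```python
import math

def log_base(c, x):
    """Logarithm of x to base c, computed through natural logarithms."""
    return math.log(x) / math.log(c)

def base_change_factor(c_from, c_to):
    """Constant k such that log_{c_to}(x) = k * log_{c_from}(x) for all x > 0."""
    return math.log(c_from) / math.log(c_to)

def power(d, x):
    """d**x rewritten with base e: only the unit of x is rescaled by ln d."""
    return math.exp(x * math.log(d))

if __name__ == "__main__":
    k = base_change_factor(10, 2)  # converts decimal logarithms to binary ones
    for x in (0.5, 3.0, 100.0):
        print(x, log_base(2, x), k * log_base(10, x))
```

The two printed logarithm columns coincide: passing from base 10 to base 2 multiplies every value of y by the same constant k.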


Exercises 


1.7.1. Construct the curves [three equations illegible in the source]
(it is natural here
to employ the fact that $x^2 + y^2 - 1 = 0$
is the equation of a unit circle centered
at the origin).

1.7.2. Construct the curve y = sin x by
taking, for example, eight values of x in the
interval from $-\pi$ to $+\pi$ in steps of $0.25\pi$. Employing
this graph, sketch the graphs of the functions
(a) $y = 2\sin x$, (b) $y = \sin 0.5x$,
(c) $y = 3\sin 3x$, (d) $y = \cos x$, (e) $y = \cos x + \sin x$
$(= \sqrt{2}\,\sin(x + \pi/4))$, (f) $y = \cos^2 x$
$(= 1/2 + (1/2)\cos 2x)$, (g) $y = \sin^2 x$
$(= 1/2 - (1/2)\cos 2x)$. [Hint. All these curves
may be obtained via translation, stretching,
and shrinking the sinusoid y = sin x; to solve
Exercises 1.7.2d, f, g, employ the identity
$\cos x = \sin(x + \pi/2)$.]

1.7.3. Plot the following curves: (a) $y = \pm\sqrt{x^2 - 1}$,
(b) $y = \pm\sqrt{x^2 + 1}$, (c) $y = 2 \pm \sqrt{(x - 1)^2 - 1}$,
and (d) $4y^2 + 4y - x^2 = 0$.
[Hint. Construct curve (a) using twenty values
of x in the interval from -5 to +5 with each
value differing from the adjacent one by 0.5
(and each being a multiple of 0.5). Transform
the equations of the curves (a) and (b) to
$y^2 - x^2 + 1 = 0$ and $x^2 - y^2 + 1 = 0$, respectively,
and note that (b) is obtained from
(a) by interchanging x and y. To construct
curve (d), transform the equation of this curve
into $4(y + 1/2)^2 - x^2 - 1 = 0$ and then, by
translating and shrinking curve (b), you will
arrive at the result.]

^{1.16} For instance, in the theory of information,
where the amount of information is expressed
by logarithmic functions of the various
variables, all logarithms are assumed binary,
which is of no importance, of course, and is
connected solely with the system of units
employed (with the fact that information is
measured in bits). The interested reader can
refer to the book of A.M. Yaglom and I.M. Yaglom,
Probability and Information, Reidel,
Dordrecht, 1983.

1.7.4. Write the equations of the curves 
obtained from the curve $y = x^3 - x$ (see
Figure 1.5.3) by (a) shrinking along the axis
Oy with a ratio 1/3 (i.e. diminishing all
vertical dimensions 3-fold), (b) substituting
$x/2$ for x, and (c) substituting $-y/2$ for y.
Sketch the three curves. 

1.7.5. Prove that stretching (or shrinking) 
the unit circle $x^2 + y^2 - 1 = 0$ to any straight
line with any ratio results in an ellipse.
[Hint. This problem may be formulated as
follows: stretching (or shrinking) a circle
$(x - a)^2 + (y - b)^2 - r^2 = 0$ away from (or toward)
the x axis, that is, substituting ky for y,
transforms the circle into an ellipse.]

1.7.6. Prove that the graph of an arbitrary 
linear-fractional function $y = (ax + b)/(cx + d)$
with $c \neq 0$ and $\Delta = ad - bc \neq 0$
may be obtained from the graph of inverse
proportionality $y = k/x$, where $k = -\Delta/c^2$, by
shifting the latter to the left by $d/c$ units (or by
$|d/c|$ units to the right if $d/c < 0$) and upward
by $a/c$ units.

1.7.7. Prove that every function may be 
represented by the sum of an odd and even 
function. [Hint. Employ the identity 

$f(x) = \tfrac{1}{2}\,(f(x) + f(-x)) + \tfrac{1}{2}\,(f(x) - f(-x)).$]

1.7.8. Prove that no two curves in 
(1.7.10a)-(1.7.10c) can be reduced to each
other by applying substitutions of the
form $x' = ax + p$ and $y' = by + q$ (where,
of course, $a \neq 0$ and $b \neq 0$), which are
equivalent to changing the position of the origin
for x and y (the translation of the origin) 
and the units of measurement for x and y. 

1.7.9. Find the rule that enables us 

to establish from the coefficients a, b, c and 
d whether (1.7.9) can be reduced to 

(1.7.10a) or (1.7.10b) or (1.7.10c). 

1.8 Parametric Representation 
of a Curve 

Let us now study the motion of a material
point, M = M(x, y). Each of the
coordinates, x and y, changes in the
course of time t, and the motion of
point M is specified by two functions
x = x(t) and y = y(t), say,

x = cos t , y = sin t. (1.8.1) 

These relations can be depicted graphically
as two curves by plotting t
on the axis of abscissas and x on the
axis of ordinates in one drawing, and
t on the axis of abscissas and y on the
axis of ordinates in the other (Figure 1.8.1).

Figure 1.8.1

Let us investigate the trajectory of 
point M. To each value of t there cor- 
respond a value of x ( t ) and a value of 
y(t). What curve will point M =
M(x(t), y(t)) = M(t) describe as t
varies? To answer this question, we can
eliminate t from the two equations
x = x(t) and y = y(t). This yields
an expression that will involve only y
and x, that is, an equation of the type
y = y(x) or F(x, y) = 0. Then we
construct the curve in the usual fashion
by specifying various x's and finding
the corresponding y's. For instance,
in our example,

$x^2 + y^2 = \cos^2 t + \sin^2 t = 1,$

that is,

$y = \pm\sqrt{1 - x^2}$, or $x^2 + y^2 - 1 = 0,$

so that the curve (1.8.1) is a circle 
of unit radius in the xy - plane (Fig- 
ure 1.8.2a). 

Figure 1.8.2

However, it often happens that even
comparatively simple expressions for
x(t) and y(t) lead to such complicated
expressions when trying to eliminate t
that it makes no sense to tackle
them. For instance, if

$x = x(t) = a_1t^4 + b_1t^3 + c_1t^2 + d_1t + e_1,$
$y = y(t) = a_2t^4 + b_2t^3 + c_2t^2 + d_2t + e_2,$

then to eliminate t we would have to
solve a fourth-degree equation, and this
leads to expressions that in view of
their unwieldiness simply cannot be
written out. Yet it is possible to construct
the curve in the xy-plane without 
eliminating t : it suffices to specify 
various values of t and find x and y 
for each of them. To illustrate this 
point, we construct the following table 
of values of x(t) and y(t) corresponding
to the simple example (1.8.1):

t            0      π/4    π/2    3π/4   π
x = cos t    1      0.7    0      -0.7   -1
y = sin t    0      0.7    1      0.7    0

t            5π/4   3π/2   7π/4   2π
x = cos t    -0.7   0      0.7    1
y = sin t    -0.7   -1     -0.7   0

It is clearly not necessary to take t
greater than $2\pi$ because the values
of x and y repeat. Using this table,
we can plot the points of the curve.
In so doing we employ only the values
of x and y. Those values of t for which
the x's and y's have been computed
are no longer needed for plotting the 
points. “The Moor has done his duty; 
the Moor can go”, as the German pro- 
verb has it. 
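
The tabulation described above is easy to mechanize. The sketch below is illustrative only (the function name and the choice of eight steps are assumptions): it lists the triples (t, x, y) for example (1.8.1) and confirms that every point lies on the unit circle.

```python
import math

def circle_points(n=8):
    """Triples (t, x, y) for x = cos t, y = sin t at t = 0, 2*pi/n, ..., 2*pi."""
    points = []
    for k in range(n + 1):
        t = 2 * math.pi * k / n
        points.append((t, math.cos(t), math.sin(t)))
    return points

if __name__ == "__main__":
    for t, x, y in circle_points():
        # Only x and y are needed to plot the point; t has "done its duty".
        print(f"t = {t:6.3f}   x = {x:5.1f}   y = {y:5.1f}   x^2 + y^2 = {x * x + y * y:.3f}")
```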

This method of representing a curve 
or, what is the same thing, of specifying 
a functional relationship y = y (x), by 
two functions, x (t) and y ( t ), is called 
parametric representation, and t is 


called a parameter.^{1.17} In physical problems,
a parameter often has a definite
physical meaning. For instance, in the 
example with which we started this 
section, t may be the time variable, 
and in this case both x ( t ) and y ( t ) 
are functions of time. Of course, one 
can be interested only in the shape of 
the trajectory described by point M(x , y) 
but it is also interesting to know the 
velocity with which the point moves 
at a certain moment of time and the 
position of the point at various moments 
in time. To find these two quanti- 
ties, we must retain the values of the 
parameter and for each point M = 
M(t) write the number t (it goes 
without saying that only a finite number
of points can be plotted on a drawing,
e.g. see Figure 1.8.2b). In this
way, a “parametrized” curve yields
more information than the trajectory of
point M alone does (compare Figures
1.8.2a and 1.8.2b).

Let us take the following curves: 

(1) x = cos t, y = sin t; 

(2) x = cos t, y = —sin t; 

(3) x = cos 3t, y = sin 3t;

(4) x = sin 3t, y = cos 3t.

In each case, if we eliminate t, we
arrive at the equation of a unit circle
in the xy-plane, or $x^2 + y^2 - 1 = 0$.
So in what respect are all these four 
cases different? 

In the first three cases, at t = 0 the 
point lies on the axis of abscissas, 
while in the fourth case it lies on the 
axis of ordinates. In cases (1) and (3) 
the point (one can imagine a heavy 
material point) is moving along a circle 
counterclockwise, while in cases (2) 
and (4) it is moving clockwise. In 
absolute value, the angular velocity of 
rotation (or the rate of rotation) or 

^{1.17} The word “parameter” was invented
by mathematicians of Ancient Greece for just
this purpose and has no exact translation.
Webster's Third New International Dictionary
gives the following definition: “An arbitrary
constant characterizing by each of its values
some member of a system (as of expressions,
curves, surfaces, functions).”

even the linear speed |v| (since the
radius of the circle is equal to unity)
is 1 rad/s (i.e. |v| = 1 m/s if the
radius of the circle is 1 m) in cases (1)
and (2), and 3 rad/s (or |v| = 3 m/s)
in cases (3) and (4).
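
The differences between the four parametrizations can also be found numerically. In the sketch below (an illustration, not the book's method) the velocity is estimated by a central difference, and the quantity $x\,\dot{y} - y\,\dot{x}$, which on the unit circle equals the angular velocity, is evaluated; a positive sign means counterclockwise motion. The names and the step size are assumptions.

```python
import math

# The four parametrizations of the unit circle listed in the text.
CURVES = {
    1: lambda t: (math.cos(t), math.sin(t)),
    2: lambda t: (math.cos(t), -math.sin(t)),
    3: lambda t: (math.cos(3 * t), math.sin(3 * t)),
    4: lambda t: (math.sin(3 * t), math.cos(3 * t)),
}

def angular_velocity(curve, t=0.3, dt=1e-6):
    """Estimate x*dy/dt - y*dx/dt at time t using central differences."""
    x, y = curve(t)
    x_next, y_next = curve(t + dt)
    x_prev, y_prev = curve(t - dt)
    vx = (x_next - x_prev) / (2 * dt)
    vy = (y_next - y_prev) / (2 * dt)
    return x * vy - y * vx

if __name__ == "__main__":
    for case, curve in CURVES.items():
        print(case, "start:", curve(0.0), "angular velocity:", round(angular_velocity(curve), 6))
```

The starting points are (1, 0) in cases (1) to (3) and (0, 1) in case (4); the angular velocities come out as 1, -1, 3, and -3.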

Here is another example. Let us
build the curve that a point on the
rim of a bicycle wheel describes when
the cyclist is moving along a straight
line, that is, the curve traced by a point A
on the circumference of a circle of radius r
as the circle rolls along a straight line.
The curve is called a cycloid.^{1.18}

Let us take a horizontal straight 
line along which our “wheel” is moving 
as the x axis, and let us assume that at 
the beginning of the motion, t = 0, 
the center of the wheel, $Q = Q_0$, lies
on the y axis at the point (0, r). We
will also assume that the wheel rolls 
with a constant velocity. For instance, 
let us assume that in unit time the 
wheel rotates through a unit angle. 
Then, in the course of time t, the wheel 
will travel a distance rt in the positive 
direction of the x axis (note that the 
wheel does not slip in the process of 
motion, so that, rotating through an 
angle t , it travels a distance equal to 
the length of the arc $BA_t$ in Figure 1.8.3,
which amounts to rt). The center (or
axis) of the wheel, $Q_0$, will occupy the
position $Q_t(rt, r)$, and the point $A_0$
will occupy $A_t$. We wish to find the
coordinates of point $A_t$.

^{1.18} From the Greek kykloeides meaning
circular.

From the right triangle $Q_tA_tP$ (see
Figure 1.8.3), in which $\angle A_tQ_tP = t$
and $Q_tA_t = r$, we get $A_tP = r\sin t$
and $Q_tP = r\cos t$. Since $BO = PT = rt$
and $Q_tB = Q_0O = r$, we get

$x = TA_t = TP - A_tP = rt - r\sin t,$
$y = OT = BP = Q_tB - Q_tP = r - r\cos t.$

The final result is

$x = r(t - \sin t)$, $y = r(1 - \cos t)$ (1.8.2)

(the solid curve in Figure 1.8.3 depicts 
the cycloid). 
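
A short numerical check of (1.8.2), offered as a sketch with illustrative names: the traced point must stay at distance r from the wheel center (rt, r), must touch the line y = 0 at t = 0 and $t = 2\pi$, and must reach its greatest height 2r at $t = \pi$.

```python
import math

def cycloid(t, r=1.0):
    """Point of the cycloid (1.8.2): x = r(t - sin t), y = r(1 - cos t)."""
    return r * (t - math.sin(t)), r * (1 - math.cos(t))

def distance_to_center(t, r=1.0):
    """Distance from the traced point to the wheel center (r*t, r)."""
    x, y = cycloid(t, r)
    return math.hypot(x - r * t, y - r)

if __name__ == "__main__":
    r = 2.0  # arbitrary sample radius
    for k in range(9):
        t = k * math.pi / 4
        x, y = cycloid(t, r)
        print(f"t = {t:6.3f}   x = {x:8.4f}   y = {y:7.4f}   |QA| = {distance_to_center(t, r):.4f}")
```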

Exercises 

1.8.1. Construct the curves given by the 
following equations: (a) x = cos t, y = sin 2t
and (b) x = cos t, y = sin 3t. [Hint. Since
sin 3t varies rapidly, you must take close-lying
values of t, say, t = 0, 0.1, 0.2, ...,
or, if you have no trigonometric tables with a
column for radians, t = 0°, 5°, 10°, 15°, ...]

1.8.2. Construct the curve x = cos (5t + 1),
y = sin (5t + 1).

1.8.3. Construct the curve x = cos t, y = 
$\cos(t + \pi/4)$.

1.8.4. Construct the curve x = cos t, y = 
cos t. 

1.8.5. Express the parametric dependence 
(1.8.2) explicitly, in the form of a function 
$x = \varphi(y)$.

1.8.6. A circle s of radius r is rolling 
without slipping along a circle S of radius R. 
Write the parametric equations of the curve 
described by a point A on the moving circle if 
(a) s lies outside S and (b) s lies inside S or
(for R < r) S lies inside s. (The curve (a) is
called an epicycloid, and the curve (b) a
hypocycloid.)

Consider the following particular cases: in
case (a) r = R, r = R/2, and r = 2R; in
case (b) r = R/2, r = R/3, and r = R/4.
(At r = R the epicycloid is called a cardioid
(from the Greek kardia meaning “heart”); at
r = R/3 and r = R/4 the hypocycloids are
called the Steiner^{1.19} curve and the astroid
(from the Latin word astrum meaning “star”).
What is the appearance of the hypocycloid at
r = R/2?)

1.9* Some Additional Topics 
from Analytic Geometry 

Above (in Section 1.2) we considered 
a number of geometrical problems 
solved by the coordinate method, or by 
methods of analytic geometry. 

^{1.19} Jacob Steiner (1796-1863), an outstanding
German geometer.

Suppose that $A_1(x_1, y_1)$, $A_2(x_2, y_2)$,
and $A_3(x_3, y_3)$ are three points on
the coordinate plane (Figure 1.9.1).
How does one find the angle $\angle A_2A_1A_3 = \alpha$?
In Section 1.2 we found that

$\tan\alpha_{12} = \dfrac{y_2 - y_1}{x_2 - x_1}$ and $\tan\alpha_{13} = \dfrac{y_3 - y_1}{x_3 - x_1},$

where $\alpha_{12}$ and $\alpha_{13}$ are the angles formed
by the x axis and the segments $A_1A_2$
and $A_1A_3$ (see formula (1.2.5)). But
Figure 1.9.1 clearly shows that

$\alpha = \angle A_2A_1A_3 = \alpha_{13} - \alpha_{12}.$

Since, according to a well-known formula
from trigonometry,

$\tan(\beta - \gamma) = \dfrac{\tan\beta - \tan\gamma}{1 + \tan\beta\tan\gamma},$

we have

$\tan\alpha = \tan(\alpha_{13} - \alpha_{12}) = \dfrac{\tan\alpha_{13} - \tan\alpha_{12}}{1 + \tan\alpha_{13}\tan\alpha_{12}} = \dfrac{\dfrac{y_3 - y_1}{x_3 - x_1} - \dfrac{y_2 - y_1}{x_2 - x_1}}{1 + \dfrac{y_3 - y_1}{x_3 - x_1}\cdot\dfrac{y_2 - y_1}{x_2 - x_1}},$

or, finally,

$\tan\alpha = \dfrac{(y_3 - y_1)(x_2 - x_1) - (y_2 - y_1)(x_3 - x_1)}{(x_3 - x_1)(x_2 - x_1) + (y_3 - y_1)(y_2 - y_1)}.$ (1.9.1)

Thus, knowing the coordinates of points
$A_1$, $A_2$, and $A_3$, we can find the angle
$\alpha = \angle A_2A_1A_3$.

Combining (1.9.1) with the well-known
formulas $\sin\alpha = [\tan^2\alpha/(1 + \tan^2\alpha)]^{1/2}$
and $\cos\alpha = (1 + \tan^2\alpha)^{-1/2}$,
we arrive at the following formulas:

$\sin\alpha = \dfrac{(x_2 - x_1)(y_3 - y_1) - (x_3 - x_1)(y_2 - y_1)}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}\,\sqrt{(x_3 - x_1)^2 + (y_3 - y_1)^2}},$ (1.9.2a)

$\cos\alpha = \dfrac{(x_2 - x_1)(x_3 - x_1) + (y_2 - y_1)(y_3 - y_1)}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}\,\sqrt{(x_3 - x_1)^2 + (y_3 - y_1)^2}}.$ (1.9.2b)


The comparatively complicated for- 
mula (1.9.1) (and also the formulas 
(1.9.2a) and (1.9.2b)) are rarely 
employed. All these formulas hold true 
for any values (arbitrary both in size 
and sign) of the coordinates of the 
points considered and the differences 
of these coordinates provided that we 
agree to reckon $\alpha$ from $A_1A_2$ to $A_1A_3$
and that $\alpha$ can be both positive and
negative.
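
The signed angle can be computed directly from the numerators of (1.9.2a) and (1.9.2b). The sketch below is an illustration: using `math.atan2` on the pair (numerator of sin, numerator of cos) is a convenience chosen here because the common positive denominator does not affect the angle; the function name and sample points are assumptions.

```python
import math

def angle_at(a1, a2, a3):
    """Signed angle A2-A1-A3, reckoned from A1A2 to A1A3, in (-pi, pi]."""
    (x1, y1), (x2, y2), (x3, y3) = a1, a2, a3
    sin_num = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # numerator of (1.9.2a)
    cos_num = (x2 - x1) * (x3 - x1) + (y2 - y1) * (y3 - y1)  # numerator of (1.9.2b)
    return math.atan2(sin_num, cos_num)

if __name__ == "__main__":
    a1, a2, a3 = (1, 1), (5, 1), (1, 4)   # sample points with a right angle at A1
    print(angle_at(a1, a2, a3))           # positive: A1A3 is counterclockwise from A1A2
    print(angle_at(a1, a3, a2))           # same magnitude, opposite sign
```

When the numerator of (1.9.2b) vanishes, the angle is a right angle, which is condition (1.9.3).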

Since $\tan\alpha$ is meaningless when $\alpha = \pi/2$
(the tangent of $\alpha$ tends to
infinity as $\alpha \to \pi/2$ and is said to become
infinite; the denominator in the
expression for $\tan\alpha$ must be zero at
$\alpha = \pi/2$), formula (1.9.1) yields the
following condition necessary and sufficient
for the straight lines $A_1A_2$ and
$A_1A_3$ to be perpendicular to each other:

$(x_3 - x_1)(x_2 - x_1) + (y_3 - y_1)(y_2 - y_1) = 0.$ (1.9.3)

The following conclusions can be made 
on the basis of (1.9.1), (1.9.2a), and 
(1.9.2b). Note that the expressions in 
the denominators in the right members 
of (1.9.2a) and (1.9.2b) are the distances
$A_1A_2$ and $A_1A_3$ (cf. (1.2.4)):

$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = A_1A_2,$
$\sqrt{(x_3 - x_1)^2 + (y_3 - y_1)^2} = A_1A_3.$

From the school course of geometry
the reader should know that

$S_{\triangle A_1A_2A_3} = \tfrac{1}{2}\,A_1A_2 \times A_1A_3 \times |\sin\alpha|,$

where $S_{\triangle A_1A_2A_3}$ is the area of triangle
$A_1A_2A_3$. This yields the following formula
for the area:^{1.20}

$S_{\triangle A_1A_2A_3} = \tfrac{1}{2}\,|(x_2 - x_1)(y_3 - y_1) - (x_3 - x_1)(y_2 - y_1)|.$ (1.9.4)

On the other hand, $S_{\triangle A_1A_2A_3} = (1/2)\,A_1A_2 \times h$,
where h is the distance
from point $A_3$ to the straight line $A_1A_2$.

^{1.20} In formulas (1.9.4) and (1.9.5) you can
drop the vertical bars and assume that
$S_{\triangle A_1A_2A_3}$ and $h_{A_3,A_1A_2}$ possess a sense of
direction, so to say, that is, their sign determines
on which side of $A_1A_2$ point $A_3$ lies.
We will not dwell on this any longer.

In view of (1.9.4), we then have the
following formula for the distance
$h = h_{A_3,A_1A_2}$ from point $A_3$ to straight
line $A_1A_2$:

$h_{A_3,A_1A_2} = \dfrac{|(x_2 - x_1)(y_3 - y_1) - (x_3 - x_1)(y_2 - y_1)|}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}.$

(1.9.5)
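
Formulas (1.9.4) and (1.9.5) translate into code almost verbatim. The following sketch uses illustrative function names and sample points that are not taken from the book.

```python
import math

def cross(a1, a2, a3):
    """The expression (x2 - x1)(y3 - y1) - (x3 - x1)(y2 - y1) shared by both formulas."""
    (x1, y1), (x2, y2), (x3, y3) = a1, a2, a3
    return (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)

def triangle_area(a1, a2, a3):
    """Area of triangle A1A2A3 by formula (1.9.4)."""
    return 0.5 * abs(cross(a1, a2, a3))

def distance_to_line(a3, a1, a2):
    """Distance from A3 to the straight line A1A2 by formula (1.9.5)."""
    return abs(cross(a1, a2, a3)) / math.hypot(a2[0] - a1[0], a2[1] - a1[1])

if __name__ == "__main__":
    a1, a2, a3 = (1, 1), (5, 1), (1, 4)   # a right triangle with legs 4 and 3
    print(triangle_area(a1, a2, a3))      # half the product of the legs
    print(distance_to_line(a3, a1, a2))   # the leg perpendicular to A1A2
```

Dropping `abs` in either function gives the signed quantities mentioned in the footnote to (1.9.4).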

We note also that the system of
coordinates in a plane for geometrical
problems is rather arbitrary: to specify
a system of coordinates we must fix
the point O (the origin) and the two
axes Ox and Oy (perpendicular to each
other), which, however, in all other
respects can be chosen at will.

For example, suppose that xOy and
x'O'y' are two different coordinate systems
(Figure 1.9.2). To establish the
relationship between the “old” coordinates
(x, y) and the “new” coordinates
(x', y') of one and the same point M,
we introduce an “intermediate” coordinate
system $x_1Oy_1$ whose origin coincides
with the origin of the old system
while the directions of the $x_1$ and $y_1$
axes coincide with the directions of the
x' and y' axes, respectively. To relate
the coordinates (x, y) and $(x_1, y_1)$ of
a point, it is convenient to consider
two systems of polar coordinates with
the same pole O and the polar axes
Ox and $Ox_1$, respectively (see Figure 1.9.2).
If $\angle xOx_1 = \alpha$, then point
$M(r, \varphi)$ (i.e. point M with the polar
coordinates in the first system
being r and $\varphi$) has coordinates $r_1$ and $\varphi_1$
in the second system, where, obviously,
$r_1 = r$ and $\varphi_1 = \varphi - \alpha$. Therefore (see
Eqs. (1.2.3)), $x = r\cos\varphi$, $y = r\sin\varphi$
and $x_1 = r_1\cos\varphi_1 = r\cos(\varphi - \alpha)$,
$y_1 = r\sin(\varphi - \alpha)$, that is,

$x_1 = r\cos(\varphi - \alpha) = r(\cos\varphi\cos\alpha + \sin\varphi\sin\alpha)$
$= r\cos\varphi\cos\alpha + r\sin\varphi\sin\alpha$
$= x\cos\alpha + y\sin\alpha,$
$y_1 = r\sin(\varphi - \alpha) = r(\sin\varphi\cos\alpha - \cos\varphi\sin\alpha)$
$= -r\cos\varphi\sin\alpha + r\sin\varphi\cos\alpha$
$= -x\sin\alpha + y\cos\alpha.$

On the other hand, from the same
Figure 1.9.2 it follows that

$x' = x_1 + a$, $y' = y_1 + b,$

where a = O'P and b = PO are the
coordinates of the old origin O in the
new system of coordinates x'O'y'.

Thus, the final result is^{1.21}

$x' = x\cos\alpha + y\sin\alpha + a,$
$y' = -x\sin\alpha + y\cos\alpha + b.$ (1.9.6)
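
Formulas (1.9.6) can be written out as a small function. The sketch below is illustrative: the names are assumptions, and the inverse map is not given in the text at this point; it is obtained here by solving the two equations of (1.9.6) for x and y.

```python
import math

def old_to_new(x, y, alpha, a, b):
    """Eq. (1.9.6): new coordinates (x', y') of the point with old coordinates (x, y)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return x * c + y * s + a, -x * s + y * c + b

def new_to_old(xp, yp, alpha, a, b):
    """Inverse of (1.9.6): old coordinates recovered from the new ones."""
    c, s = math.cos(alpha), math.sin(alpha)
    u, v = xp - a, yp - b
    return u * c - v * s, u * s + v * c

if __name__ == "__main__":
    alpha, a, b = math.radians(30), 2.0, -1.0   # arbitrary sample values
    print(old_to_new(0.0, 0.0, alpha, a, b))    # the old origin has new coordinates (a, b)
    xp, yp = old_to_new(3.0, 4.0, alpha, a, b)
    print(new_to_old(xp, yp, alpha, a, b))      # recovers (3, 4) up to rounding
```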

These formulas are fundamental to 
geometry. The method of coordinates
makes it possible to characterize every
point in a plane by a pair of numbers,
x and y, the coordinates of the point;
if we have two points, then two pairs
of coordinates are needed; and a set
of points can be replaced with a set of
pairs of numbers. For instance, a curve
in a plane has corresponding to it 
a one-parameter set of pairs of numbers, 
x (t) and y ( t ), that depend on parame- 
ter t (cf. Section 1.8). The transition 

^{1.21} As is customary in analytic geometry
we restrict our discussion only to right-handed
coordinate systems, that is, systems in which
a rotation of the positive semiaxis Ox through
an angle of 90° that maps the semiaxis into
the positive semiaxis Oy must be performed
in the positive direction, or counterclockwise.
The reader can easily show that if xOy is
a right-handed coordinate system and x'O'y'
a left-handed one, then

$x' = x\cos\alpha + y\sin\alpha + a,$
$y' = x\sin\alpha - y\cos\alpha + b.$

(What is the meaning here of angle $\alpha$ and line
segments a and b?)

Figure 1.9.3

from one system of coordinates to 
another in no way affects the geomet- 
ric properties of curves, and for this 
reason in geometry the only relation- 
ships between coordinates of points that 
have any meaning are those that retain 
their form under any transformation 
of the (1.9.6) type, that is, under 
a transfer from coordinate system xOy
to any other coordinate system x'O'y'. 
And these are the relations that were 
studied above. 

The aforesaid can be worded in
a different manner. The equation y =
f(x) or F(x, y) = 0 of a certain curve $\Gamma$
depends, obviously, not only on the
shape of the curve but also on the
choice of the coordinate system; in
other words, the equation describes not
the curve alone (Figure 1.9.3a) but
a much more complicated object: the
curve $\Gamma$ and the coordinate axes (the
axis of abscissas and the axis of ordinates;
and not only the axes but also the
unit segments on the axes that fix the
units of measurement for x and y; see
Figure 1.9.3b). But only those expressions
connected with the equation of a curve
that retain their form under a substitution
of variables via formulas (1.9.6) have
geometric meaning (and we are interested
here in geometry and not in physics^{1.22}).
For instance, in
the equation of a circle of radius r
centered at point Q(a, b),

$(x - a)^2 + (y - b)^2 = r^2$, or

$(x - a)^2 + (y - b)^2 - r^2 = 0,$ (1.9.7)

the quantities a and b characterize
(only) the position of the circle in the
coordinate plane, with only the number r

^{1.22} In physics the situation is somewhat
different; in this connection see Section 9.8.


related to the “geometry” of the curve 
(with the size of the circle; see Exer- 
cise 1.9.6). 

To demonstrate the aforesaid, let us 
turn to formula (1.2.4) for the
distance $r_{12}$ between two points. If
$A_1(x_1, y_1)$ and $A_2(x_2, y_2)$ are two points,
then, in view of (1.2.4), the square of
the distance between them, $r_{12}^2$, is equal
to $(x_2 - x_1)^2 + (y_2 - y_1)^2$. Let us introduce
new coordinates, x' and y'.
Then, according to (1.9.6),

$x_1' = x_1\cos\alpha + y_1\sin\alpha + a,$
$y_1' = -x_1\sin\alpha + y_1\cos\alpha + b,$

and, similarly, we can also find the new
coordinates $x_2'$ and $y_2'$ of point $A_2$ in
terms of the old coordinates $x_2$ and
$y_2$. We then have

$x_2' - x_1' = (x_2 - x_1)\cos\alpha + (y_2 - y_1)\sin\alpha,$
$y_2' - y_1' = -(x_2 - x_1)\sin\alpha + (y_2 - y_1)\cos\alpha$

and, respectively,

$(x_2' - x_1')^2 + (y_2' - y_1')^2$
$= [(x_2 - x_1)\cos\alpha + (y_2 - y_1)\sin\alpha]^2 + [-(x_2 - x_1)\sin\alpha + (y_2 - y_1)\cos\alpha]^2$
$= (x_2 - x_1)^2(\cos^2\alpha + \sin^2\alpha) + (y_2 - y_1)^2(\sin^2\alpha + \cos^2\alpha)$
$= (x_2 - x_1)^2 + (y_2 - y_1)^2,$

which is proof of the independence
of $r_{12}$ of the choice of the system of
coordinates.
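
The same invariance can be spot-checked numerically. This is only a sanity check on randomly chosen values, not a substitute for the algebra above; the function names, ranges, and seed are arbitrary assumptions.

```python
import math
import random

def transform(p, alpha, a, b):
    """New coordinates of point p = (x, y) according to (1.9.6)."""
    x, y = p
    c, s = math.cos(alpha), math.sin(alpha)
    return x * c + y * s + a, -x * s + y * c + b

def dist(p, q):
    """Distance between two points by formula (1.2.4)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def max_distance_change(trials=1000, seed=0):
    """Largest observed |r12 - r12'| over random points and random transformations."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        alpha = rng.uniform(-math.pi, math.pi)
        a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
        p = (rng.uniform(-10, 10), rng.uniform(-10, 10))
        q = (rng.uniform(-10, 10), rng.uniform(-10, 10))
        change = abs(dist(p, q) - dist(transform(p, alpha, a, b), transform(q, alpha, a, b)))
        worst = max(worst, change)
    return worst

if __name__ == "__main__":
    print(max_distance_change())   # of the order of rounding error
```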

Clearly, two straight lines $y = k_1x + b_1$
and $y = k_2x + b_2$ (Figure 1.9.4)
are parallel if and only if
their slopes are the same, that is, if
$k_1 = k_2$. Let us now consider two straight
lines $y = k_1x$ and $y = k_2x$, which
pass through the origin O and are parallel
to our straight lines (which are not
parallel to each other, in general;
the dashed lines in Figure 1.9.4). Since
there are two points $A_1(1, k_1)$ and
$A_2(1, k_2)$ that belong to the new straight
lines (i.e. the ones that pass through
O), the angle $\alpha$, equal to the angle
$\angle A_1OA_2$ between our straight lines
(it is unimportant whether these are
the new lines or the old lines), is determined
via formula (1.9.1):

$\tan\alpha = \dfrac{(k_2 - 0)(1 - 0) - (k_1 - 0)(1 - 0)}{(1 - 0)(1 - 0) + (k_2 - 0)(k_1 - 0)}$

(since here the points O(0, 0), $A_1(1, k_1)$,
and $A_2(1, k_2)$ play the role of points
$A_1(x_1, y_1)$, $A_2(x_2, y_2)$, and $A_3(x_3, y_3)$
in formula (1.9.1)). Thus, the final result
is

$\tan\alpha = \dfrac{k_2 - k_1}{1 + k_2k_1}.$ (1.9.8)

In particular, two straight lines $y = k_1x + b_1$
and $y = k_2x + b_2$ are perpendicular
to each other if and only if

$k_1k_2 + 1 = 0$, or $k_1k_2 = -1$. (1.9.9)

(Why?) 
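
Formula (1.9.8) and condition (1.9.9) can be tried on a few slopes. The sketch below is illustrative; the function name and the tolerance used to detect a vanishing denominator are assumptions made for this example.

```python
import math

def angle_between(k1, k2, eps=1e-12):
    """Angle from the line with slope k1 to the line with slope k2, by (1.9.8).

    When the denominator 1 + k1*k2 vanishes, tan(alpha) is undefined and the
    lines are perpendicular, which is condition (1.9.9).
    """
    denominator = 1 + k1 * k2
    if abs(denominator) < eps:
        return math.pi / 2
    return math.atan((k2 - k1) / denominator)

if __name__ == "__main__":
    print(math.degrees(angle_between(0.0, 1.0)))    # the lines y = 0 and y = x
    print(math.degrees(angle_between(2.0, -0.5)))   # k1 * k2 = -1: perpendicular lines
    print(math.degrees(angle_between(3.0, 3.0)))    # equal slopes: parallel lines
```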

Now, suppose that y = kx + b and
$y = kx + b_1$ are two parallel straight
lines l and $l_1$ whose slopes are the same,
k, and which intersect the y axis
at points B(0, b) and $B_1(0, b_1)$ (Figure 1.9.5).
In this case, obviously,
$BB_1 = |b_1 - b|$; on the other hand, the
perpendicular $B_1P$ dropped from point
$B_1$ onto l forms an angle $\alpha$ with the y
axis that is equal to the angle between l
(or $l_1$) and the x axis (see Figure 1.9.5,
where $\angle BB_1P$ and $\angle xQl$ are angles with
mutually perpendicular sides). But
$\tan\angle xQl = \tan\alpha = k$, by the very definition
of the slope of a straight line,
which means that $\tan\angle BB_1P = k$
and, hence,

$\cos\angle BB_1P = \cos\alpha = \dfrac{1}{\sqrt{1 + \tan^2\alpha}} = \dfrac{1}{\sqrt{1 + k^2}}.$

Figure 1.9.5

From triangle $BB_1P$ (see Figure 1.9.5)
we obtain

$B_1P = BB_1\cos\alpha = \dfrac{|b_1 - b|}{\sqrt{1 + k^2}}.$

We have found that the distance
$d = B_1P$ between parallel lines l and $l_1$
is given by the formula

$d = \dfrac{|b_1 - b|}{\sqrt{1 + k^2}}.$ (1.9.10)


If l is a straight line given by the
equation y = kx + b and $M(x_0, y_0)$
is an arbitrary point (see Figure 1.9.5),
then the straight line $l_1$ passing
through point M parallel to l is given by the
equation $y - y_0 = k(x - x_0)$, or $y = kx + b_1$,
where $b_1 = y_0 - kx_0$ (see formula (1.3.3)).
Hence, in view of (1.9.10)
the distance h from point M to straight
line l is given by the formula

$h = \dfrac{|y_0 - kx_0 - b|}{\sqrt{1 + k^2}}$ (1.9.11)

(note that in the numerator on the
right-hand side of (1.9.11) we have
the absolute value of the result of substituting
the coordinates of point M
into the left-hand side of the equation
of the straight line, y - kx - b = 0).
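
The passage from (1.9.10) to (1.9.11) can be mirrored in code: the distance from a point to a line equals the distance between the line and its parallel through the point. The names and the sample line are illustrative assumptions.

```python
import math

def distance_parallel(k, b, b1):
    """Distance between y = k*x + b and y = k*x + b1, formula (1.9.10)."""
    return abs(b1 - b) / math.sqrt(1 + k * k)

def distance_point_line(x0, y0, k, b):
    """Distance from M(x0, y0) to the line y = k*x + b, formula (1.9.11)."""
    return abs(y0 - k * x0 - b) / math.sqrt(1 + k * k)

if __name__ == "__main__":
    k, b = 2.0, 1.0          # sample line y = 2x + 1
    x0, y0 = 3.0, 2.0        # sample point M
    b1 = y0 - k * x0         # intercept of the parallel line through M
    print(distance_point_line(x0, y0, k, b))
    print(distance_parallel(k, b, b1))   # the same number, by construction
```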


Exercises 


1.9.1. Given a point M and a straight line l:
(a) M = M(1, -1), y = x + 1; (b) M =
M(0, -1), y = 4; (c) M = M(-2, 0),
y = -x - 2; and (d) M = M(0, 0) (the
origin), y = (3/4)x - 1/4. Draw a line p
that passes through M and is perpendicular
to l.

1.9.2. Given three points $A_1$, $A_2$, $A_3$:
$A_1(0, 0)$, $A_2(4, 3)$, $A_3(-6, 8)$; $A_1(4, -4)$,
$A_2(4, 0)$, $A_3(-1, 1)$; and $A_1(1, 2)$,
$A_2(2, 1)$, $A_3(-1, 1)$. (a) Find $\angle A_1A_2A_3$,
(b) calculate $S_{\triangle A_1A_2A_3}$, and (c) find $h_{A_1,A_2A_3}$.

1.9.3. Given two parallel straight lines l
and $l_1$: (a) y = x + 2, y = x - 2; and
(b) 3x - 4y - 1 = 0, -3x + 4y = 0. Find the
distance d between the lines.

1.9.4. Given the point M and the straight 
line l of Exercise 1.9.1. Find the distance h 
between M and l. 

1.9.5. Suppose that $A_1 = A_1(x_1, y_1)$,
$A_2 = A_2(x_2, y_2)$, $A_3 = A_3(x_3, y_3)$, and
$A_4 = A_4(x_4, y_4)$ are four points in the xy-plane.
Prove that $A_1A_2 \perp A_3A_4$ if and only if
$(x_2 - x_1)(x_4 - x_3) + (y_2 - y_1)(y_4 - y_3) = 0.$

1.9.6. Establish how the equation (1.9.7) 
of a circle transforms under a transformation 
to new coordinates x' and y' via (1.9.6). Ver- 
ify that the quantities a and b in (1.9.7) 
change (i.e. that the new equation will be of the 
same form but with different a and b) while r 
remains the same. [Hint. To solve this problem 
it proves convenient to “reverse” formulas 


(1.9.6), that is, to express the old coordinates 
x and y in terms of the new coordinates x' 
and y'.] 

1.9.7. Prove that under a transition (1.9.6) 
to another system of coordinates the following 
conditions and formulas retain their form: 
(a) the condition $k_1 = k_2$ that two straight
lines are parallel, (b) the condition (1.9.9)
that two straight lines are perpendicular to
each other, (c) the formula (1.9.8) for the
angle between two straight lines, (d) the
formula (1.9.10) for the distance between two
parallel straight lines, (e) the formula (1.9.11)
for the distance from a point to a straight line,
(f) the formula (1.9.4) for the area of a triangle,
(g) the formula (1.9.1) for an angle in a
triangle, (h) the formula (1.9.5) for the distance
from a point to a straight line, and
(i) the conditions for two straight lines being
perpendicular to each other (see Exercise 1.9.5).
[Hint. See the hint to Exercise 1.9.6.]

2.1 Motion, Distance, and Velocity

Let us examine the translational mo- 
tion of an object, or body, along a 
straight line. Denote the distance of some
point M of the body to a specific
point O on the line (the coordinate of
point M) by z. We will consider the distance
in one direction to be positive and
in the opposite direction negative. For
example, suppose that the straight line
along which our body is moving is
vertical. Points above O will correspond
to positive values of z, and those
below O to negative values of z. It is
often convenient to assume that the
body is small, so that we can simply
speak of a material point or particle,
M, and of the distance of this point
from a definite point O on the straight
line, or the origin of our coordinate
system.

Problems on the motion of bodies 
with constant velocities v lead to sim- 
ple arithmetic and algebraic computa- 
tions based on the fact that distance is 
velocity (speed) multiplied by time, 
that is, the elementary formula s = vt, 
where s is distance and t is time. How- 
ever, in nature as a rule we deal with 
motions whose velocities vary with 
time. Studies of such motions lead to 
two important physical concepts, dis- 
tance and velocity , as functions of time, 
and already at this stage there appear 
the two basic concepts of higher mathe- 
matics, the concept of the derivative 
and the concept of the integral. Start- 
ing with this chapter, we consider 
these concepts in great detail. 

The process of motion of a particle M 
consists in the fact that the z-coordinate 
of this point changes with time. The 
motion of the body (or point M) is 
determined by the dependence of z 
on t, that is, is characterized by the
function z = z(t). Knowing this function,
we can find the position of the
body at any moment in time. The function
z(t) may be represented graphically
by laying off time on the axis of
abscissas (the t axis) and the distance
from point M to point O on the axis of
ordinates (the z axis). 

In uniform motion, with constant 
velocity v, the distance s covered by the 
body in time t is directly proportional 
to t, with the coefficient of proportionality
being v (here s = vt). Denote
by $z_0$ the coordinate of the body at
time t = 0. The distance s covered in
time t will then be equal to the difference
$z(t) - z_0$. Thus,

$z(t) = z_0 + vt.$ (2.1.1)

Hence, in uniform motion, the
dependence of the coordinate z on time t
is given by a linear function. The
graph of the function z = z(t) in this
case is a straight line in the tz-plane
(Figure 2.1.1). 

In the case of nonuniform motion, the
function z = z(t) is expressed by more
involved formulas and the corresponding
graph is some kind of curve.
Let us analyze the following basic
problem: given a function z = z(t), or
the dependence of the coordinate of
the body on the time variable, find
the rate of motion (or velocity) v of
the body. In the general case of nonuniform
motion, the velocity does not
remain constant; it changes in the course
of time. This means that the veloci- 
ty v is also a function of the time vari- 
able t, or v = v ( t ), and our problem 
consists in expressing this function in 
terms of the known function z = z (t). 



Figure 2.1.1

Everything is simple in the particular
case of uniform motion. The velocity
is defined as the distance covered in
unit time. Let us find, for instance, the
distance traversed in one second from
time $t_1$ seconds to time $t_1 + 1$ seconds.
This distance (equal numerically to
the velocity) is equal to the difference
between the coordinates $z(t_1 + 1)$ and
$z(t_1)$:

$z(t_1 + 1) - z(t_1) = [z_0 + v(t_1 + 1)] - [z_0 + vt_1] = v.$

Instead of this we can take an arbitrary
interval of time between $t_1$ and $t_2$ and
divide the distance traveled $z_2 - z_1$ by
the magnitude of the interval $t_2 - t_1$:

$\dfrac{z_2 - z_1}{t_2 - t_1} = \dfrac{(z_0 + vt_2) - (z_0 + vt_1)}{t_2 - t_1} = v.$ (2.1.2)

It is precisely because the velocity is 
constant that we are able to take any 
interval $t_2 - t_1$ to compute the velocity,
and the answer will be independent
of both the time $t_1$ and the magnitude
of the time interval, $|t_2 - t_1|$. The
situation is different in the case of 
a variable rate of motion. 

Before going over to the more general 
case, it will be convenient to change 
the notation. We will write $t_1 = t$ and
$t_2 = t + \Delta t$, so that the difference
$t_2 - t_1$ (the time interval) is denoted
by $\Delta t$ (see Figure 2.1.1).^{2.1} Similarly,
we write $\Delta z$ to denote the difference

$z(t_2) - z(t_1) = z(t + \Delta t) - z(t) = \Delta z.$

In this notation, formula (2.1.2) can
be rewritten thus:

$v = \dfrac{\Delta z}{\Delta t}.$ (2.1.2a)


[2.1] Note that $\Delta$ is not a factor but a symbol, and $\Delta t$
is not the product of $\Delta$ by $t$, just as $\sin \alpha$ is not the
product of $\sin$ by $\alpha$, and so $\Delta$ cannot be canceled from
the numerator and denominator on the right-hand sides of (2.1.2a) and
(2.1.3), in the same way as we cannot cancel $\sin$ from the numerator
and denominator in the fraction $\sin \alpha / \sin \beta$. $\Delta$ is
the capital Greek letter delta; $\Delta t$ is read “delta t” and
$\Delta z$ is read “delta z”; these are also spoken of as the
increment, or change, in time and the increment, or change, in path
(or distance).



Figure 2.1.2 

In the general case of nonuniform motion, the right-hand side of
(2.1.2a) yields the average velocity $v_{\text{av}}$ in the interval
between $t$ and $t + \Delta t$ (the size of the interval is therefore
$\Delta t$):

$$v_{\text{av}} = \frac{\Delta z}{\Delta t}. \tag{2.1.3}$$

We speak here of the average velocity because the velocity itself can
change over the interval $\Delta t$.

Let us consider an example in which $z(t)$ is given by a quadratic
function instead of the linear function shown in Figure 2.1.1:

$$z(t) = z_0 + bt + ct^2. \tag{2.1.4}$$

Figure 2.1.2 illustrates a possible graph corresponding to a function
of the form (2.1.4). Let us compute the average velocity
$v_{\text{av}}$ over the interval $\Delta t$ using the formula (2.1.3).
In the case at hand we have

$$z(t) = z_0 + bt + ct^2,$$
$$z(t + \Delta t) = z_0 + b(t + \Delta t) + c(t + \Delta t)^2 = z_0 + bt + b\Delta t + ct^2 + 2ct\Delta t + c(\Delta t)^2$$

and, hence,

$$\Delta z = z(t + \Delta t) - z(t) = b\Delta t + 2ct\Delta t + c(\Delta t)^2.$$

From this we get

$$v_{\text{av}} = \frac{\Delta z}{\Delta t} = b + 2ct + c\Delta t. \tag{2.1.5}$$

Compare the results of (2.1.2a) and (2.1.5) for the average velocity
when the motion obeys the law (2.1.1) and the law (2.1.4). The second
example differs from the first in that here the average velocity
depends both on the time $t$ and on the time interval $\Delta t$.
How can we find the instantaneous velocity at time $t$, which depends
solely on this moment in time?

The velocity changes gradually, and so the smaller the time interval
over which the distance traveled is measured, the smaller the change in
velocity and, hence, the closer the average velocity will be to the
instantaneous value. Formula (2.1.5) for $v_{\text{av}}$ contains two
terms that do not depend on the size of $\Delta t$ and one term that is
proportional to $\Delta t$. For very small $\Delta t$ the product
$c\Delta t$ will also be small, whence the last term on the right-hand
side of (2.1.5) can be ignored and the formula for $v_{\text{av}}$
turns into a formula for the instantaneous velocity:

$$v_{\text{inst}} = b + 2ct. \tag{2.1.6}$$

For instance, suppose $z_0 = 1$, $b = 2$, and $c = -1$ in (2.1.4). We
wish to find the velocity at $t = 0$. Equation (2.1.5) then yields

$$v_{\text{av}} = 2 + 2 \times (-1) \times 0 + (-1) \times \Delta t = 2 - \Delta t.$$

We can construct the following table:

    Δt, s (interval from 0 to Δt)    1      0.5    0.1    0.05   0.01
    v_av                             1      1.5    1.9    1.95   1.99

Thus, the smaller the size of $\Delta t$, the closer $v_{\text{av}}$ is
to 2, and it is natural to assume that $v_{\text{inst}} = 2$.
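
The table can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic: the function names `z` and `average_velocity` are our own, and the default coefficients are the values $z_0 = 1$, $b = 2$, $c = -1$ chosen in the text.

```python
def z(t, z0=1.0, b=2.0, c=-1.0):
    """Law of motion (2.1.4): z(t) = z0 + b*t + c*t**2."""
    return z0 + b * t + c * t ** 2


def average_velocity(t, dt):
    """Average velocity (2.1.3) over the interval from t to t + dt."""
    return (z(t + dt) - z(t)) / dt


if __name__ == "__main__":
    # Same interval sizes as in the table; the starting time is t = 0.
    for dt in (1, 0.5, 0.1, 0.05, 0.01):
        print(f"dt = {dt:<5} v_av = {average_velocity(0.0, dt):.4f}")
```

Shrinking `dt` further (say to 0.001) brings the printed value still closer to 2, in agreement with formula (2.1.6).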

The attentive reader will have most likely recognized the expressions
(2.1.4) and (2.1.6) from the school course of physics to be the
formulas for uniformly accelerated motion:

$$z(t) = z_0 + v_0 t + \frac{at^2}{2}, \qquad v(t) = v_0 + at. \tag{2.1.7}$$

All that is needed is to substitute for $b$ in (2.1.4) the initial
velocity $v_0$ (i.e. the velocity at time $t = 0$) and for $c$ in
(2.1.4) the “half-acceleration” $a/2$ (since $a$ is the acceleration of
the body).


We have computed the instantaneous velocity at time $t$ on the basis of
the average velocity over the interval from $t$ to $t + \Delta t$. Now
let us try to compute it by choosing the interval in a somewhat
different way. We find the average velocity in the interval from
$t_1 = t - (3/4)\Delta t$ to $t_2 = t + (1/4)\Delta t$. As before, the
duration of the interval is $\Delta t$. From formula (2.1.4) we get

$$z(t_1) = z_0 + b\left(t - \tfrac{3}{4}\Delta t\right) + c\left(t - \tfrac{3}{4}\Delta t\right)^2,$$
$$z(t_2) = z_0 + b\left(t + \tfrac{1}{4}\Delta t\right) + c\left(t + \tfrac{1}{4}\Delta t\right)^2$$

and

$$z(t_2) - z(t_1) = b\Delta t + 2ct\Delta t - \tfrac{1}{2}c(\Delta t)^2.$$

Whence it follows that

$$v_{\text{av}} = b + 2ct - \tfrac{1}{2}c\Delta t. \tag{2.1.8}$$

Comparing (2.1.5) and (2.1.8), we see that the average velocities over
the interval from $t$ to $t + \Delta t$ and over the interval from
$t - (3/4)\Delta t$ to $t + (1/4)\Delta t$ differ by
$c\Delta t[1 - (-1/2)] = (3/2)c\Delta t$. But if we want to find the
instantaneous velocity, we have to take a very small time interval
$\Delta t$. Then the difference between the two expressions for the
average velocity will vanish and we again obtain for the instantaneous
velocity $v_{\text{inst}} = b + 2ct$. However, the value of the average
velocity over the interval from $t - (3/4)\Delta t$ to
$t + (1/4)\Delta t$ given by formula (2.1.8) is closer to the
instantaneous velocity (2.1.6) than the value of the average velocity
over the interval from $t$ to $t + \Delta t$ given by formula (2.1.5).
This implies that (with greater precision) the choice of the interval
from $t$ to $t + \Delta t$ is less convenient: here $v_{\text{av}}$ is a
worse approximation for $v_{\text{inst}}$. It is even better to select
the interval in such a way that the value of $t$ we are interested in
lies at the center of the interval; in this case, for the quadratic
dependence (2.1.4) between $z$ and $t$, the value of $v_{\text{av}}$
will coincide with $v_{\text{inst}}$ (see Exercise 2.1.2). We will
dwell on this fact again in Section 6.1.
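
The comparison of the three ways of placing the interval can be checked numerically. The sketch below is illustrative only: the helper names and the sample values $z_0 = 1$, $b = 2$, $c = -1$, $t = 1$, $\Delta t = 0.1$ are our assumptions, not part of the original discussion.

```python
def z(t, z0=1.0, b=2.0, c=-1.0):
    """Quadratic law of motion (2.1.4) with sample coefficients."""
    return z0 + b * t + c * t ** 2


def average_velocity(t_start, t_end):
    """Distance covered divided by the duration of the interval."""
    return (z(t_end) - z(t_start)) / (t_end - t_start)


def compare_intervals(t, dt, b=2.0, c=-1.0):
    """Errors of three interval placements relative to v_inst = b + 2ct."""
    v_inst = b + 2 * c * t
    return {
        "forward": average_velocity(t, t + dt) - v_inst,
        "shifted": average_velocity(t - 0.75 * dt, t + 0.25 * dt) - v_inst,
        "centered": average_velocity(t - 0.5 * dt, t + 0.5 * dt) - v_inst,
    }


if __name__ == "__main__":
    for name, error in compare_intervals(1.0, 0.1).items():
        print(f"{name:>8}: error = {error:+.6f}")
```

With these values the forward interval is off by $c\Delta t$, the shifted one by $-\tfrac{1}{2}c\Delta t$, and the centered one by zero up to rounding, as formulas (2.1.5) and (2.1.8) predict.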

We have considered the concept of instantaneous velocity for two
specific cases: uniform and uniformly accelerated motion. In Section
2.3 we give a more exact definition of instantaneous velocity for an
arbitrary law of motion.

Exercises

2.1.1. The following dependence of the distance (z) traveled by a
material particle M along a straight line on time (t) was observed:
(a) z =t* + t, and (b) z = (1 + t*)~K What 
is the average velocity $v_{\text{av}}$ of each motion over the time
interval from $t$ to $t + \Delta t$? What is the instantaneous velocity
$v_{\text{inst}}$ at time $t$?

2.1.2. For the quadratic dependence of $z$ on $t$ given by formula
(2.1.4), find $v_{\text{av}}$ using the formula

$$v_{\text{av}} = \frac{z(t + \Delta t/2) - z(t - \Delta t/2)}{\Delta t}$$

and compare the result with the value of $v_{\text{inst}}$ at time $t$.

2.2 Specific Heat Capacity of an
Object. Thermal Expansion

Note that a procedure similar to the 
one discussed in the previous section is 
often encountered in many physical 
problems. We will demonstrate this 
using two simple examples and the know- 
ledge that a school physics course 
supplies. 

The reader will recall that the specific 
heat capacity of a substance is the quan- 
tity of heat (in joules) required to change 
the temperature of 1 kg of the sub- 
stance (water, iron, gold, etc.) by 1 °C. 
But for different initial temperatures 
the quantity of heat required to change 
the temperature of 1 kg of substance by 
1 °C is different, so that the specific 
heat capacity is a function of tempera- 
ture T, or c = c (T). For instance, to 
heat 1 kg of iron taken at a tempera- 
ture of 0 °C up to 1 °C we must supply 
440.857 J of heat, while to heat the 
same amount of iron taken at 50 °C up 
to 51 °C we must supply 470.583 J. 
Then how does one define the heat capac- 
ity of a body corresponding to a fixed 
temperature T ? 

The content of Section 2.1 suggests the following method for defining
the quantity we are interested in, $c = c(T)$ (the specific heat
capacity at temperature $T$). Here the quantity of heat $Q$ (in joules)
that must be delivered to the specified amount (1 kg) of the substance
in order to heat the substance from the initial temperature (it is
unimportant which initial temperature specifically) up to temperature
$T$ is the analog of the distance $z$ of Section 2.1, and it is clear
that this quantity depends on $T$, or $Q = Q(T)$. Heating the 1 kg of
the substance from temperature $T_1$ to temperature $T_2$ requires
$Q(T_1, T_2) = Q(T_2) - Q(T_1)$ joules of heat, while heating the same
amount of substance from $T$ to $(T + \Delta T)$ °C requires
$Q(T + \Delta T) - Q(T) = \Delta Q$ joules of heat. Therefore, the
average specific heat capacity $c_{\text{av}}$ over the temperature
interval from $T$ to $T + \Delta T$ is naturally defined as

$$c_{\text{av}} = \frac{Q(T + \Delta T) - Q(T)}{\Delta T} = \frac{\Delta Q}{\Delta T}, \tag{2.2.1}$$

and the instantaneous specific heat capacity $c_{\text{inst}}$ (the
word “instantaneous” refers in this case not to a definite moment in
time but to a definite temperature $T$) is defined as the value of the
average specific heat capacity $c_{\text{av}}$ over a very small
temperature interval $\Delta T$, and the narrower the interval the
closer $c_{\text{av}}$ will be to $c_{\text{inst}}$. Note that in the
majority of cases the value of 1 °C for $\Delta T$ is sufficiently
small to yield an exact value of $c = c(T)$. Here the expression
“sufficiently small” means that the value of $c$ obtained in this
manner will for all practical purposes coincide with the value at which
we arrive by selecting a smaller (and even considerably smaller)
interval $\Delta T$ of temperature variation.

Example 1. The quantity of heat $Q = Q(T)$ required to heat 1 kg of
iron from 0 °C to $T$ °C is given by the following empirical formula
(which represents the process we are interested in quite adequately, at
least for $T < 200$ °C):

$$Q(T) = 440.857T + 0.29725T^2. \tag{2.2.2}$$

In full accordance with the material of Section 2.1 this yields (note
that the function (2.2.2) is quadratic, just as (2.1.4) is)

$$c_{\text{av}} = \frac{\Delta Q}{\Delta T} = \frac{440.857(T + \Delta T) + 0.29725(T + \Delta T)^2 - 440.857T - 0.29725T^2}{\Delta T} = 440.857 + 0.5945T + 0.29725\Delta T \tag{2.2.3}$$

and, hence,

$$c_{\text{inst}} = 440.857 + 0.5945T. \tag{2.2.4}$$

We see that $c(0) = 440.857$ J/(kg·°C), as we noted earlier, while,
say, $c(100) = 440.857 + 0.5945 \times 100 = 500.307$ J/(kg·°C).

A similar situation arises in all problems where the quantity we are
interested in is the rate of variation of another quantity. For
instance, the coefficient $k$ of (thermal) linear expansion is usually
defined as the number that shows by what amount a thin rod made of the
substance of interest (the length of the rod is assumed equal to 1 cm)
changes its length when its temperature is increased by 1 °C. Here too
the number $k$ is not a constant: it depends on the initial temperature
$T$ of the rod, whereby, strictly speaking, our definition, which
involves heating the rod from $T$ to $(T + 1)$ °C, is not quite
correct, since in the process of heating the quantity (or function)
$k = k(T)$ varies.

Then how are we to determine $k_{\text{inst}}$, or the quantity $k$
corresponding to a definite temperature $T$? Let us take a rod 1 cm
long at 0 °C and gradually heat it. The length of the rod, $L$, will
change in the process and, hence, is a function of the rod’s
temperature $T$, or $L = L(T)$. If we heat the rod from temperature $T$
to temperature $T + \Delta T$, where the temperature increment
$\Delta T$ is assumed small, the length of the rod will increase by

$$\Delta L = L(T + \Delta T) - L(T),$$

so that the following average elongation of the rod corresponds to an
increase in temperature by 1 °C:

$$\frac{L(T + \Delta T) - L(T)}{\Delta T} = \frac{\Delta L}{\Delta T}.$$

However, this elongation refers not to a 1-cm rod but to a rod of
length $L$, so per unit rod length we have (on the average)

$$k_{\text{av}} = \frac{1}{L}\,\frac{\Delta L}{\Delta T}. \tag{2.2.5}$$

The value of $k_{\text{av}}$ given by formula (2.2.5) is the
approximate value [2.2] of the (instantaneous) value of the coefficient
of linear expansion $k = k(T)$ of the substance we are interested in,
and the smaller the temperature interval $\Delta T$ we select, the
closer $k_{\text{av}}$ will be to $k_{\text{inst}}$.

Example 2. For platinum, within a broad temperature range, the function
$L = L(T)$ we are discussing here obeys fairly well the following
empirical formula:

$$L(T) = 1 + 8.806 \times 10^{-6}T + 1.95 \times 10^{-9}T^2 \tag{2.2.6}$$

(note that the dependence of $L$ on $T$ is again quadratic). We wish to
express the coefficient of linear expansion $k = k(T)$ of platinum as a
function of $T$ and find the values $k(0)$ and $k(1000)$.

Answer. Just as we did above (check all the calculations yourself), we
find that

$$k_{\text{inst}} = \frac{8.806 \times 10^{-6} + 3.90 \times 10^{-9}T}{1 + 8.806 \times 10^{-6}T + 1.95 \times 10^{-9}T^2}, \tag{2.2.7}$$

from which it follows that $k(0) = 8.806 \times 10^{-6}$ (1/°C) and
$k(1000) = 12.57 \times 10^{-6}$ (1/°C).
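
Both examples of this section can be checked with the same "small interval" recipe. The following Python sketch is a numerical illustration under our own assumptions: the function names are invented here, and the interval size of 0.001 °C is an arbitrary choice that is merely small enough for the purpose.

```python
def heat_iron(T):
    """Empirical heat (2.2.2), in joules, to warm 1 kg of iron from 0 to T deg C."""
    return 440.857 * T + 0.29725 * T ** 2


def length_platinum(T):
    """Empirical length (2.2.6), in cm, of a platinum rod that is 1 cm long at 0 deg C."""
    return 1 + 8.806e-6 * T + 1.95e-9 * T ** 2


def average_rate(f, x, dx):
    """Increment of f divided by the increment of its argument."""
    return (f(x + dx) - f(x)) / dx


def specific_heat(T, dT=1e-3):
    """Approximate c(T) as the average heat capacity over a small interval."""
    return average_rate(heat_iron, T, dT)


def expansion_coefficient(T, dT=1e-3):
    """Approximate k(T) following (2.2.5): elongation per degree per unit length."""
    return average_rate(length_platinum, T, dT) / length_platinum(T)


if __name__ == "__main__":
    print(f"c(0)    = {specific_heat(0.0):.3f} J/(kg*degC)")
    print(f"c(100)  = {specific_heat(100.0):.3f} J/(kg*degC)")
    print(f"k(0)    = {expansion_coefficient(0.0):.4e} 1/degC")
    print(f"k(1000) = {expansion_coefficient(1000.0):.4e} 1/degC")
```

Repeating the run with a different small `dT` should change the printed digits only far beyond the precision quoted in the text, which is what "sufficiently small" means in practice.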


[2.2] The approximate nature of formula (2.2.5) manifests itself in the
fact that we relate the elongation $\Delta L$ (more precisely the
specific elongation $\Delta L/\Delta T$) to the length $L = L(T)$ of
the rod at the initial temperature, although in reality the rod’s
length changes continuously in the entire heating process. (Why should
we divide $\Delta L/\Delta T$ by $L(T)$ and not by $L(T + \Delta T)$?)
This inaccuracy in formula (2.2.5) for $k$ is unimportant. (Why?)

Exercises

2.2.1. It is known that the amount of heat $Q(T)$ (joules) required to
heat a 1-kg diamond (!) from 0 to $T$ °C can be expressed fairly well
by the following empirical formula (which is valid in the 0 to 800 °C
temperature range):

Q ( T ) = 0.3965T + 2.081 X KHZ’* 

—5.024 x io- ? r 3 . 

Find the average specific heat capacity $c_{\text{av}}$ of diamond over
the interval from $T$ to $(T + \Delta T)$ °C and the “instantaneous”
specific heat capacity $c = c(T)$ as a function of the temperature $T$.
What are the values of $c(0)$, $c(100)$, and $c(500)$ for diamond?

2.2.2. The quantity of heat $Q(T)$ required for heating 1 kg of water
from 0 to $T$ °C is fairly well represented by the following empirical
formula:

Q (T) = 41 86.68T + 8373.36 X 10 ~ b T 2 
+ 1256 x io - 6 r 3 . 

What is the specific heat capacity c (T) of 
water? Find the values of c (10) and c (90). 

2.2.3. Prove that (for every substance and at any temperature $T$) the
coefficient of volume expansion (also known as the bulk expansion
coefficient), that is, the “instantaneous rate” of increase in volume
of substance resulting from an increase in temperature per unit volume,
is three times the coefficient of linear expansion.


2.3 The Derivative of a Function.
Simple Examples of Calculating
Derivatives

In Section 2.1 we considered the problem of instantaneous velocity and
examined ratios of the form $[z(t_2) - z(t_1)]/(t_2 - t_1)$, where the
values $t_1$ and $t_2$ must be considered very close-lying. The
expression “close-lying” is not exact, that is, mathematically it is
not rigorous. The exact formulation is this. It is necessary to find
the limit to which the ratio

$$\frac{z(t_2) - z(t_1)}{t_2 - t_1} \tag{2.3.1}$$

tends as $t_2$ tends to $t_1$. Using the designations $\Delta t$ and
$\Delta z$, we can rewrite this ratio as

$$\frac{z(t_2) - z(t_1)}{t_2 - t_1} = \frac{\Delta z}{\Delta t}, \tag{2.3.2}$$

and the condition $t_2 \to t_1$ now takes the form $\Delta t \to 0$.

In (2.3.2), the quantities $\Delta t$ and $\Delta z$ are related: any
time interval $\Delta t = t_2 - t_1$ can be selected, but after the
denominator $\Delta t$ has been selected, it is assumed that $\Delta z$
(the numerator) is not just any distance interval but precisely that
distance that corresponds to the time interval $\Delta t$. This was
obvious in formula (2.3.1) from the way the ratio was written; the
numerator is the difference $z(t_2) - z(t_1)$ of the values of the
function $z = z(t)$ at $t_2$ and at $t_1$; the right-hand side of
(2.3.2) is simply an abbreviation of the ratio (2.3.1).

Thus, the quantity that interests us, the instantaneous velocity
$v_{\text{inst}} = v(t_1)$ at time $t_1$, is the limit of the ratio
$\Delta z/\Delta t$ as $\Delta t$ tends to zero (where
$\Delta t = t_2 - t_1$). This statement can be written as

$$v(t_1) = \lim_{\Delta t \to 0} \frac{\Delta z}{\Delta t}.$$

Here $v(t_1)$ is precisely the instantaneous velocity and lim stands
for “limit”. The particular kind of limit we have in mind is indicated
underneath lim: when $\Delta t$ approaches zero, and the arrow stands
for “approaches”. The quantity to the right of lim is the one whose
limit is being sought. The notation is read as follows: “the limit of
$\Delta z$ over $\Delta t$ with $\Delta t$ approaching zero,” where the
word “over” replaces the phrase “divided by the corresponding value
of.”

What meaning do we attribute to the terms “limit” and “approaching the
limit”? The calculations carried out in Section 2.1 served as an
illustration of these notions. We saw that for small intervals
$\Delta t$ the value of $v_{\text{av}}$ in Example 2 in Section 2.1
differed from the value of $v_{\text{inst}}$ by a quantity proportional
to $\Delta t$. For small $\Delta t$ this quantity is small, too: the
smaller the $\Delta t$, the smaller the quantity, and for infinitely
small values of $\Delta t$ (i.e. so small that they can be assumed
negligible against the background of $b$ and $2ct$ in (2.1.5)) the
quantity also becomes infinitely small. It is precisely for this reason
that we can neglect the term with $\Delta t$ in the expression for
$v_{\text{av}}$ when $\Delta t$ is small.

Thus, the ratio

$$\frac{\Delta z}{\Delta t} = \frac{z(t_2) - z(t_1)}{t_2 - t_1} \tag{2.3.3}$$

or, as we predominantly wrote in Section 2.1 by replacing $t_1$ with
$t$ and $t_2$ with $t + \Delta t$,

$$\frac{\Delta z}{\Delta t} = \frac{z(t + \Delta t) - z(t)}{\Delta t} \tag{2.3.3a}$$

tends to a definite limit when $\Delta t = t_2 - t_1$ tends to zero.
[2.3] The corresponding limit is the instantaneous velocity $v$, which
is also a function of $t$:

$$\lim_{\Delta t \to 0} \frac{\Delta z}{\Delta t} = v(t). \tag{2.3.4}$$

Why is it that, when computing the velocity from the given formula
$z(t)$, we have to carry out so many calculations and find $\Delta z$
for distinct $\Delta t$ and only then find the limit
$\lim_{\Delta t \to 0} \Delta z/\Delta t$? Couldn’t we simply take the
value $\Delta t = 0$ from the very start? No, because in this case we
would simply get $\Delta z = 0$, since
$\Delta z = z(t + \Delta t) - z(t)$, and if $\Delta t = 0$, then
$\Delta z = 0$. By this thoughtless mode of operations we would get
$\Delta z/\Delta t = 0/0$, which means we would get nothing definite.

When computing velocity, the whole idea is to take small $\Delta t$ and
small $\Delta z$ that correspond to $\Delta t$. In this way, we obtain
a very definite ratio $\Delta z/\Delta t$ each time. When $\Delta t$ is
reduced (tends to zero), then $\Delta z$ diminishes in approximate
proportion to $\Delta t$, and so the ratio remains approximately
constant, that is, the ratio $\Delta z/\Delta t$ approaches a definite
limit when $\Delta t$ tends to zero. [2.4] The magnitude of this limit
is the instantaneous velocity $v(t)$ for the case where $t$ is time and
$z$ is distance.

[2.3] We could have also started from the initial formula (2.3.3) for
the ratio and assumed that $t_2$ and $t_1$ tend to a definite value of
$t$, so that $\Delta t = t_2 - t_1$ tends to zero (in particular, see
the text in small type at the end of Section 6.1).

[2.4] In some “unnatural” laws of motion $z = z(t)$ this limit may not
exist; the respective restrictions will be considered later. We advise
the reader not to think, at least for the time being, about such
exceptions.

The limit of the ratio of the increment of the function to the
increment of the independent variable as the increment of the
independent variable tends to zero is of prime importance in higher
mathematics and its numerous applications. We have already seen, for
example, that such an important concept as the (instantaneous) velocity
of motion is found with the aid of the limit of such a ratio. (In
Section 2.2 we reduced to a similar procedure the problems of
calculating the specific heat capacity and the coefficient of linear
expansion; think over these examples.) This is why the limit of this
ratio has a special name: the derivative of the function or, simply,
the derivative. This name is due to the fact that if $z$ is a function
of $t$, or $z = z(t)$, then the limit (2.3.4) depends both on the type
of function $z = z(t)$ and on the value of $t$ at which this limit is
calculated; in other words, this limit is also a function of $t$, a new
function derived from the old function $z(t)$. It is natural then to
speak of this function as the derivative, where the word derivative
stresses the dependence of this new function on the basic function
$z = z(t)$.

We have special notations for the derivative. One notation
(differential notation) is

$$\frac{dz}{dt} = \lim_{\Delta t \to 0} \frac{\Delta z}{\Delta t}.$$

Here the quantity $dz/dt$ (it is read “d z d t”) is not a fraction but
is simply an abbreviated way of writing the limit on the right. The
quantity $dz/dt$ is written in the form of a fraction to remind us that
it is obtained from the fraction $\Delta z/\Delta t$ by a passage to
the limit. [2.5]

A different notation for derivatives is the prime notation,
$v = z'(t)$, or, for example, for the function $y = y(x)$,

$$y' = y'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

[2.5] Below (in Section 4.1) we will see that the expression $dz/dt$
(or $dy/dx$) can also be interpreted as a fraction; for the time being,
however, the reader must interpret this notation as simply a symbol.

Figure 2.3.1


Occasionally, in place of the function symbol, one gives the expression
of the function: if $z = at^2 + b$, then instead of $dz/dt$ we can
write directly $d(at^2 + b)/dt$ or $(at^2 + b)'$.

Thus (and this is extremely important), the derivative of a function is
defined as the limit of the ratio of the increment of the function to
the increment of the independent variable as the latter tends to zero:

$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}. \tag{2.3.5}$$


The instantaneous velocity of motion of a body is equal to the
derivative of the body’s coordinate with respect to time. By analogy
with this, when $x$ is not time and $y$ is not distance, we
nevertheless speak of the derivative $dy/dx$ as the rate of change (or
variation) of the function $y$ under the variation of the independent
variable $x$. For instance, using the notations of Figure 2.3.1, we can
say that the ratio $\Delta y/\Delta x$ is the “average rate” of
increase of the function $y$ on the interval $AB$ within which the
independent variable $x$ varies, while the limit of this ratio, with
point $B$ approaching point $A$, expresses the rate of growth of $y$ at
point $A$ on the $x$ axis.

Let us now find algebraically the derivative of the function

$$z = t^2 \tag{2.3.6}$$

(we have performed this operation in Section 2.1). Form the ratio

$$\frac{\Delta z}{\Delta t} = \frac{(t + \Delta t)^2 - t^2}{\Delta t}.$$

Next we remove the brackets in the numerator,

$$\Delta z = (t + \Delta t)^2 - t^2 = t^2 + 2t\Delta t + (\Delta t)^2 - t^2 = 2t\Delta t + (\Delta t)^2,$$

and get

$$\frac{\Delta z}{\Delta t} = \frac{2t\Delta t + (\Delta t)^2}{\Delta t} = 2t + \Delta t. \tag{2.3.7}$$

The first term on the right-hand side of (2.3.7) is independent of
$\Delta t$, while the second tends to zero together with $\Delta t$.
Hence only one term is left that does not decrease as we send
$\Delta t$ to zero, and we can write

$$\frac{dz}{dt} = \frac{d(t^2)}{dt} = \lim_{\Delta t \to 0} \frac{\Delta z}{\Delta t} = \lim_{\Delta t \to 0} (2t + \Delta t) = 2t. \tag{2.3.6a}$$

Let us consider another example:

$$z = t^3. \tag{2.3.8}$$

Here

$$\Delta z = (t + \Delta t)^3 - t^3 = t^3 + 3t^2\Delta t + 3t(\Delta t)^2 + (\Delta t)^3 - t^3 = 3t^2\Delta t + 3t(\Delta t)^2 + (\Delta t)^3,$$
$$\frac{\Delta z}{\Delta t} = 3t^2 + 3t\Delta t + (\Delta t)^2,$$

and

$$\frac{dz}{dt} = \frac{d(t^3)}{dt} = \lim_{\Delta t \to 0} [3t^2 + 3t\Delta t + (\Delta t)^2] = 3t^2. \tag{2.3.8a}$$

The limit was readily found in these examples because $\Delta t$
canceled out when we computed the ratio $\Delta z/\Delta t$. Let us
consider a more complicated example:

$$z = \frac{1}{t}. \tag{2.3.9}$$

In this case we have

$$\frac{\Delta z}{\Delta t} = \frac{\dfrac{1}{t + \Delta t} - \dfrac{1}{t}}{\Delta t}.$$

Can we disregard the quantity $\Delta t$ in the first fraction, in the
expression $1/(t + \Delta t)$, when we pass to the limit? No, we
cannot, because we have not yet canceled out the quantity $\Delta t$ in
the denominator. By substituting $1/t$ for $1/(t + \Delta t)$, we
introduce a small error (if $\Delta t$ is small) in one of the summands
of the numerator of the fraction $\Delta z/\Delta t$. But in this
fraction both numerator and denominator are small if $\Delta t$ is
small. For this reason we cannot allow for a small error in the
numerator (it transforms the numerator into zero).

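Here is the proper way to do this:

$$\Delta z = \frac{1}{t + \Delta t} - \frac{1}{t} = \frac{t - (t + \Delta t)}{t(t + \Delta t)} = -\frac{\Delta t}{t(t + \Delta t)}, \qquad \frac{\Delta z}{\Delta t} = -\frac{1}{t(t + \Delta t)}.$$

Now we can find the limit (the derivative) by dropping $\Delta t$ in
the denominator:

$$\frac{dz}{dt} = \frac{d(1/t)}{dt} = \lim_{\Delta t \to 0} \left[-\frac{1}{t(t + \Delta t)}\right] = -\frac{1}{t^2}. \tag{2.3.9a}$$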
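
The three ratios derived above can be verified with exact rational arithmetic, so that no rounding obscures the cancellation of $\Delta t$. This is a sketch only; the function names are our own, and the point $t = 2$ is an arbitrary choice for illustration.

```python
from fractions import Fraction


def difference_quotient(z, t, dt):
    """Ratio of the increment of z to the increment dt of its argument."""
    return (z(t + dt) - z(t)) / dt


def square(t):
    return t ** 2


def cube(t):
    return t ** 3


def reciprocal(t):
    return 1 / t


if __name__ == "__main__":
    t = Fraction(2)
    for k in range(1, 5):
        dt = Fraction(1, 10 ** k)
        # Expected closed forms: 2t + dt, 3t^2 + 3t*dt + dt^2, -1/(t(t + dt)).
        print(
            f"dt = {float(dt):<7}",
            f"t^2: {float(difference_quotient(square, t, dt)):.6f}",
            f"t^3: {float(difference_quotient(cube, t, dt)):.6f}",
            f"1/t: {float(difference_quotient(reciprocal, t, dt)):.6f}",
        )
```

As `dt` shrinks, the three columns approach $2t = 4$, $3t^2 = 12$, and $-1/t^2 = -0.25$, the values given by (2.3.6a), (2.3.8a), and (2.3.9a).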

These examples illustrate a very important property, the fundamental
property of limits (this property is commonly taken as the definition
of a limit). As $\Delta t$ is made smaller and smaller, the difference
between the value of the ratio $\Delta z/\Delta t$ and the limit of
this ratio (this limit is the derivative),
$\lim_{\Delta t \to 0} \Delta z/\Delta t = dz/dt$, may be made as small
as we please, which is to say, less than any given number. What is
essential here is that the difference then remains smaller than the
given number in the further process of variation of the ratio
$\Delta z/\Delta t$.

An example will serve to illustrate this point. For $z = 1/t$ we have

$$\frac{dz}{dt} = -\frac{1}{t^2}, \qquad \frac{\Delta z}{\Delta t} = -\frac{1}{t(t + \Delta t)}.$$

Let us take, say, $t = 2$. Then $dz/dt = -0.25$. Can we choose
$\Delta t$ such that $\Delta z/\Delta t$ differs from the limit, or
$-0.25$, by less than 0.0025? What we mean is that $\Delta t$ will have
to be chosen such that

$$\frac{\Delta z}{\Delta t} = -\frac{1}{2(2 + \Delta t)} = -\frac{1}{4 + 2\Delta t}$$

lies between $-0.25 + 0.0025 = -0.2475$ and
$-0.25 - 0.0025 = -0.2525$.

But this, as can easily be seen, is sure to be the case if $\Delta t$
in absolute value is made smaller than about 0.02 (more precisely, for
$-0.0198 < \Delta t < 0.0202$), and as $\Delta t$ is made smaller still
we will never encounter a value of the difference greater than 0.0025.

The same goes for other functions as well: the approach to a limit as
$\Delta t \to 0$ signifies the opportunity of choosing $\Delta t$ such
that any degree of closeness to the limit is attainable (this closeness
is then never lost).
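
The tolerance example for $z = 1/t$ at $t = 2$ lends itself to a direct numerical check. In the sketch below the names `ratio` and `max_deviation`, the sampling scheme, and the bound 0.019 (taken a little under 0.02 so as to stay safely inside the admissible range) are our own assumptions.

```python
def ratio(dt, t=2.0):
    """Difference quotient of z = 1/t, which equals -1 / (t * (t + dt))."""
    return (1.0 / (t + dt) - 1.0 / t) / dt


def max_deviation(bound, samples=1000, t=2.0):
    """Largest |ratio - limit| over sampled dt with 0 < |dt| <= bound."""
    limit = -1.0 / t ** 2
    worst = 0.0
    for i in range(1, samples + 1):
        dt = bound * i / samples
        for signed_dt in (dt, -dt):
            worst = max(worst, abs(ratio(signed_dt, t) - limit))
    return worst


if __name__ == "__main__":
    print(f"|dt| <= 0.019: worst deviation = {max_deviation(0.019):.6f}")
    print(f"|dt| <= 0.1  : worst deviation = {max_deviation(0.1):.6f}")
```

For the tighter bound every sampled ratio stays within 0.0025 of the limit $-0.25$, whereas the looser bound of 0.1 lets the deviation exceed the tolerance. To meet a smaller tolerance, one simply chooses a smaller bound.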

Finding the derivative in the special case of $z = t$ is particularly
simple: quite obviously, $\Delta z = \Delta t$ and
$\Delta z/\Delta t = 1$, that is, the ratio $\Delta z/\Delta t$ is
equal to unity for arbitrary (large or small) $\Delta t$ and hence in
the limit as well. Thus,

$$\text{if } z = t, \text{ then } \frac{dz}{dt} = \frac{dt}{dt} = 1. \tag{2.3.10}$$

Finally, if we consider a constant, $z = c$, it can also be regarded as
a special case of a function: the graph of this function is a straight
line parallel to the $x$ axis (see Figure 1.3.6). In this case clearly
$\Delta z = 0$ for all $\Delta t$, whereby the following rule holds
true:

$$\text{if } z = c, \text{ then } \frac{dz}{dt} = \frac{dc}{dt} = 0. \tag{2.3.11}$$

2.4 Properties of Derivatives. 
Approximating the Values 
of a Function by Means 
of a Derivative 

Here are some general properties of derivatives.

If a function is multiplied by a constant factor, then the derivative
is multiplied by the same factor. For instance,

$$\text{if } z = 3t^2, \text{ then } \frac{dz}{dt} = 3\frac{d(t^2)}{dt} = 3 \times 2t = 6t.$$

In general,

$$\text{if } z(t) = ay(t), \text{ then } \frac{dz}{dt} = a\frac{dy}{dt}. \tag{2.4.1}$$

It is also obvious that the derivative of the sum of two functions is
equal to the sum of the derivatives of the two functions:

$$\text{if } z(t) = x(t) + y(t), \text{ then } \frac{dz}{dt} = \frac{dx}{dt} + \frac{dy}{dt}. \tag{2.4.2}$$

The last rule can easily be generalized so as to include the sum of
three, four, and, in general, of any number of functions.

Using the above two rules, we find that the derivative of a sum of
several functions taken with constant (but, generally speaking,
different) coefficients is equal to the sum of the derivatives of these
functions with the same coefficients:

$$\text{if } z(t) = ax(t) + by(t) + cu(t), \text{ then } \frac{dz}{dt} = a\frac{dx}{dt} + b\frac{dy}{dt} + c\frac{du}{dt}. \tag{2.4.3}$$

Each of these rules is readily proved starting directly from the
definition of the derivative: these rules hold true for the increment
$\Delta z = z(t + \Delta t) - z(t)$ of the function $z = z(t)$ (for any
$\Delta t$), whereby they are valid for the ratio $\Delta z/\Delta t$
and for the limit of the ratio, or the derivative $dz/dt$. For
instance, if $z(t) = ay(t)$, with $a$ constant, then

$$\Delta z = z(t + \Delta t) - z(t) = ay(t + \Delta t) - ay(t) = a[y(t + \Delta t) - y(t)] = a\Delta y$$

and, hence,

$$\frac{\Delta z}{\Delta t} = a\frac{\Delta y}{\Delta t} \text{ for all } \Delta t, \text{ i.e. } \frac{dz}{dt} = a\frac{dy}{dt}.$$

If $z(t) = x(t) + y(t)$, then

$$\Delta z = z(t + \Delta t) - z(t) = [x(t + \Delta t) + y(t + \Delta t)] - [x(t) + y(t)] = [x(t + \Delta t) - x(t)] + [y(t + \Delta t) - y(t)] = \Delta x + \Delta y$$

and

$$\frac{\Delta z}{\Delta t} = \frac{\Delta x + \Delta y}{\Delta t} = \frac{\Delta x}{\Delta t} + \frac{\Delta y}{\Delta t},$$

whence $\dfrac{dz}{dt} = \dfrac{dx}{dt} + \dfrac{dy}{dt}$, and so on.

It is now easy to find the derivative of a polynomial. We already know
that

$$\frac{dc}{dt} = 0, \qquad \frac{dt}{dt} = 1, \qquad \frac{d(t^2)}{dt} = 2t, \qquad \frac{d(t^3)}{dt} = 3t^2. \tag{2.4.4}$$

This yields

$$\frac{d(a + bt + ct^2 + et^3)}{dt} = \frac{da}{dt} + b\frac{dt}{dt} + c\frac{d(t^2)}{dt} + e\frac{d(t^3)}{dt} = b + 2ct + 3et^2. \tag{2.4.5}$$

For instance, in Section 2.1 we considered the function
$z(t) = z_0 + bt + ct^2$ (see formula (2.1.4)). In view of (2.4.5),
where we must put $a = z_0$ and $e = 0$, we have

$$v(t) = \frac{dz}{dt} = b + 2ct,$$

which is the result we arrived at in Section 2.1 without turning to the
general properties of derivatives.

The technique for finding derivatives 
(also called the differentiation of func- 
tions) is given in detail at the beginning 
of Chapter 4. 
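
The rules for a constant factor and for a sum reduce the differentiation of a polynomial to simple bookkeeping with its coefficients. The Python sketch below illustrates this; representing a polynomial as a list of coefficients (lowest power first) and the names `differentiate` and `evaluate` are our own conventions, and the code uses the general term-by-term pattern $d(t^k)/dt = kt^{k-1}$, which the text has established so far only for $k = 0, 1, 2, 3$.

```python
def differentiate(coefficients):
    """Derivative of a polynomial given as [a0, a1, a2, ...] for a0 + a1*t + a2*t**2 + ...

    Each term a_k * t**k contributes k * a_k * t**(k - 1); the constant term drops out.
    """
    return [k * a for k, a in enumerate(coefficients)][1:]


def evaluate(coefficients, t):
    """Value of the polynomial at t, computed by Horner's scheme."""
    result = 0
    for a in reversed(coefficients):
        result = result * t + a
    return result


if __name__ == "__main__":
    # z(t) = 1 + 2t - t^2, the sample law of motion used in Section 2.1.
    law_of_motion = [1, 2, -1]
    velocity = differentiate(law_of_motion)
    print("velocity coefficients:", velocity)
    print("velocity at t = 0:", evaluate(velocity, 0))
```

For the cubic $a + bt + ct^2 + et^3$ the same function returns the coefficients $b$, $2c$, $3e$ of the quadratic derivative.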

Running ahead a bit, we may point 
out that finding derivatives of functions 
given by formulas is a relatively simple 
job, much easier, say, than the solu- 
tion of algebraic equations. The for- 
mulas for the derivatives of functions 
are often even simpler (and never more 
complicated) than the formulas defining 
the functions. For instance, if the func- 
tion is a polynomial, its derivative is 
also a polynomial, and the new polyno- 
mial is simpler than the initial polyno- 
mial in the sense that its degree is low- 
er (see the above example of a third- 
degree polynomial, whose derivative 
proved to be a quadratic function of the 
independent variable; a similar situa- 
tion arises for polynomials of other 
degrees). If the function is an algebraic 
fraction, then the derivative is also a 
fraction. If the function contains roots
or fractional powers, then the derivative
also contains them. The derivatives
of trigonometric functions are also
trigonometric functions, and in some ca-
ses (the logarithmic function or inverse 
trigonometric functions, for instance) 
the derivative proves to be a simpler 
function (an algebraic fraction for the 
logarithm and the arctangent). 

Finding derivatives does not require 
any kind of special ingenuity or imagi- 
nation. The problem is always solved 
in a neat fashion through the use of 
the simple rules given above. As we 
have already said, other rules and ex- 
amples of application will be given in 
Chapter 4. 

So far, all the functions we have considered are defined by formulas.
One should not think, however, that this is absolutely necessary for
the existence of a derivative. For example, we can regard the
dependence of distance covered upon time as having been found from
experiments, in the form of very extensive tables. It is clearly
possible, using these tables, to compute the instantaneous velocity
(i.e. the derivative) by forming the ratios
$\dfrac{\Delta z}{\Delta t} = \dfrac{z(t + \Delta t) - z(t)}{\Delta t}$
for various $\Delta t$ and observing how the ratios behave when we send
$\Delta t$ to zero. Of course, here we cannot speak of the limit of the
ratio $\Delta z/\Delta t$ with $\Delta t$ approaching zero, since for a
function specified by a table there is no way in which we can make
$\Delta t$ as small as desired ($\Delta t$ cannot be made smaller than
the step interval of the table), whereby we cannot find the exact value
of the derivative. In the majority of cases, however, tables yield a
sufficiently good estimate for the derivative $dz/dt$, which exists for
all “good,” or “well-behaved” (or “smooth”), functions, regardless of
the origin of the functions and the way in which the functions are
specified. We note also that the “arithmetic calculation of the
derivative,” or finding the values of the ratio $\Delta z/\Delta t$
(or, for the function $y = f(x)$, the ratio $\Delta y/\Delta x$) for a
number of decreasing values of $\Delta t$ (or $\Delta x$) and observing
the behavior of the values of the ratio, is simplified considerably if
one employs a pocket calculator.
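
The "arithmetic calculation of the derivative" from a table is easily mechanized. The sketch below is an illustration under stated assumptions: the function name `table_ratios` is ours, and the table is not experimental data but made-up values generated from the sample law $z = 1 + 2t - t^2$ with a step of 0.1 s.

```python
def table_ratios(times, distances, index):
    """Ratios (dt, dz/dt) formed from table entry `index` to each later entry.

    The widest interval comes first, so the list ends with the smallest
    available dt, which is the step of the table.
    """
    t0, z0 = times[index], distances[index]
    ratios = []
    for j in range(len(times) - 1, index, -1):
        dt = times[j] - t0
        ratios.append((dt, (distances[j] - z0) / dt))
    return ratios


if __name__ == "__main__":
    # Hypothetical table: times in seconds, distances from z = 1 + 2t - t^2.
    times = [i / 10 for i in range(6)]
    distances = [1 + 2 * t - t ** 2 for t in times]
    for dt, value in table_ratios(times, distances, 0):
        print(f"dt = {dt:.1f}  dz/dt = {value:.3f}")
```

The ratios drift toward the velocity at the chosen entry as the interval narrows, but the sequence necessarily stops at the table step, so the last ratio is only an estimate of the derivative.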

The derivative $dz/dt$ is defined as the limit of the ratio of the increments $\Delta z/\Delta t$ as $\Delta t \to 0$. When $\Delta t$ is not equal to zero, the ratio of the increments $\Delta z/\Delta t$ is not equal in general to the derivative $dz/dt$, but this ratio is approximately equal to $dz/dt$, and the smaller $\Delta t$ is, the better the approximation. Therefore, we can write the approximate relationship

$$\frac{\Delta z}{\Delta t} \approx \frac{dz}{dt} = z'(t), \qquad \Delta z \approx \frac{dz}{dt}\,\Delta t = z'(t)\,\Delta t. \tag{2.4.6}$$


From this we can find the approximate value of the function $z(t + \Delta t)$:

$$z(t + \Delta t) = z(t) + \Delta z \approx z(t) + \frac{dz}{dt}\,\Delta t = z(t) + z'(t)\,\Delta t. \tag{2.4.7}$$

Note that in (2.4.7) the first equality sign is exact in accord with the definition of $\Delta z$ and the second sign denotes approximate equality.

Let us now return to the designations $t_2 = t + \Delta t$ and $t_1 = t$ used earlier. We can then rewrite Eq. (2.4.7) as

$$z(t_2) \approx z(t_1) + z'(t_1)\,(t_2 - t_1). \tag{2.4.7a}$$

Thus, given a small difference $t_2 - t_1$, that is, when $t_2$ is close to $t_1$, the function $z(t_2)$ can be expressed by an approximate formula involving the value of the function $z(t)$ and its derivative $z'(t)$ at $t = t_1$. Note that this formula is linear in $t_2$ (it contains $t_2$ only to the first power).

Formula (2.4.7) (we will return to this formula in Section 4.1) is extremely important, so we discuss it in detail. Let us take an example. Suppose $v = x^3$, and for the sake of clarity we assume that we are speaking of a cube of volume $v$ and edge $x$. The derivative $dv/dx$, as we know, is equal to $3x^2$, so that (2.4.7) takes the form

$$v(x + \Delta x) = (x + \Delta x)^3 \approx x^3 + 3x^2\,\Delta x,$$

that is,

$$\Delta v = v(x + \Delta x) - v(x) \approx 3x^2\,\Delta x, \tag{2.4.8}$$

while the exact expression for $v(x + \Delta x)$ is, in view of the well-known formula,

$$v(x + \Delta x) = (x + \Delta x)^3 = x^3 + 3x^2\,\Delta x + 3x\,(\Delta x)^2 + (\Delta x)^3,$$

whence

$$\Delta v = 3x^2\,\Delta x + 3x\,(\Delta x)^2 + (\Delta x)^3. \tag{2.4.8a}$$

Suppose that the edge is 1 m long. Then the cube’s volume is, obviously, 1 m$^3$. But what can we say about the volume if it is found that in measuring the edge length we introduced an error, so that the edge is actually somewhat longer than 1 m, say, by 1 or 5 mm or even by 1 or 2 cm?

If our error amounts to 1 mm, then actually $x = 1.001$ m, and the initial value of the volume $v = 1$ must be increased by $\Delta v$ given by formula (2.4.8a). Let us estimate the contribution of each term on the right-hand side of (2.4.8a):

$$3x^2\,\Delta x = 3 \times 1^2 \times 0.001 = 0.003,$$

$$3x\,(\Delta x)^2 = 3 \times 1 \times (0.001)^2 = 0.000003,$$

$$(\Delta x)^3 = (0.001)^3 = 0.000000001;$$

hence

$$v(1.001) = 1.003003001, \qquad \Delta v = 0.003003001. \tag{2.4.9}$$

But is there any sense in writing out all the digits in $v$? The third term on the right-hand side of (2.4.8a) is smaller than the first term by a factor of 3 000 000 (three million!), so retaining the third term in this case cannot be justified. The second term is also smaller than the first term, by a factor of 1000, so that the precision it provides for the final result is illusory: if the error in measuring the cube’s edge is not precisely 1 mm but 1.1 mm (which, of course, is quite a realistic assumption), the first term on the right-hand side of (2.4.8a) will be equal to 0.0033 instead of the initial value of 0.003, and all the subsequent digits will become unreliable. So is there any reason to perform unnecessary work and clutter up the calculations with unnecessary figures?

A similar picture arises at other values of $\Delta x$ (values that can still be considered small). For instance, at $\Delta x = 0.005$ m ($= 1/2$ cm) we have

$$3x^2\,\Delta x = 3 \times 1^2 \times 0.005 = 0.015,$$

$$3x\,(\Delta x)^2 = 3 \times 1 \times (0.005)^2 = 0.000075,$$

$$(\Delta x)^3 = (0.005)^3 = 0.000000125.$$

Thus, the final result is

$$v(1.005) = (1.005)^3 = 1.015075125,$$

where, of course, we can also confine ourselves (and, as a rule, this is even necessary) to the approximation $v \approx 1.015$.

The results of numerical calculations associated with this example are listed in detail in the table below, from which the reader can see that even at $\Delta x = 0.02$ m $= 2$ cm, or, if the requirements concerning the accuracy of the result are low, even at $\Delta x = 0.05$ m $= 5$ cm, replacing $(x + \Delta x)^3$ with $x^3 + 3x^2\,\Delta x$ is quite admissible:

$x = 1 + \Delta x$        1 + 0 = 1   1.001      1.005      1.01
$v = (1 + \Delta x)^3$    1           1.003003   1.015075   1.0303
$1 + 3\Delta x$           1           1.003      1.015      1.03

$x = 1 + \Delta x$        1.02     1.05     1.1      1.5     2
$v = (1 + \Delta x)^3$    1.0612   1.1576   1.3310   3.375   8
$1 + 3\Delta x$           1.06     1.15     1.30     2.50    4
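The comparison in the table can be reproduced with a short Python sketch. The function names below are illustrative choices, not notation from the text; the code simply evaluates the exact cube and the linear formula (2.4.7) for an edge of 1 m.

```python
def cube_exact(dx):
    """Exact volume (1 + dx)**3 of a cube whose edge is 1 + dx."""
    return (1.0 + dx) ** 3


def cube_linear(dx):
    """Linear approximation v(1) + v'(1) * dx, with v(1) = 1 and v'(1) = 3."""
    return 1.0 + 3.0 * dx


print("x        exact       linear   difference")
for dx in (0.001, 0.005, 0.01, 0.02, 0.05, 0.1, 0.5, 1.0):
    exact, approx = cube_exact(dx), cube_linear(dx)
    print(f"{1 + dx:<8g} {exact:<11.6f} {approx:<8.3f} {exact - approx:.6f}")
```

The last column is the sum of the second-order and third-order terms of (2.4.8a), so it shows directly how quickly the neglected part grows with the size of the increment.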


Another example is $y = \sqrt{x}$ (say, $y$ is the side of a square plate with area or mass [2.6] $x$). We find the values of the function $y$ for $x$ close to 4. In this case $y(4) = \sqrt{4} = 2$. The derivative $y'(x) = 1/(2\sqrt{x})$ (see Exercise 2.4.2), so $y'(4) = 1/(2\sqrt{4}) = 1/4$, and the approximate formula (2.4.7) is of the form

$$y(x) = \sqrt{4 + \Delta x} \approx 2 + 0.25\,\Delta x.$$

[2.6] The mass $m$ of a homogeneous square plate whose side is $y$ is equal to $\rho y^2$, where $\rho$ is a constant (the plate’s density); the function $m = \rho y^2$ differs from $x = y^2$ by an unimportant factor $\rho$ (in this connection see Section 1.7).


We again compare the approximate and exact values of $y(x)$:

$x = 4 + \Delta x$          4    4.1       4.5      5
$y = \sqrt{4 + \Delta x}$   2    2.02485   2.1213   2.24
$2 + 0.25\,\Delta x$        2    2.025     2.125    2.25

$x = 4 + \Delta x$          6      7      8      9
$y = \sqrt{4 + \Delta x}$   2.45   2.65   2.83   3
$2 + 0.25\,\Delta x$        2.50   2.75   3.0    3.25

Let us now return to our basic example concerning distance, time, and velocity, that is, we assume that $t$ is the time, $z(t)$ is the distance covered in time $t$, $z'(t) = dz/dt$ is the instantaneous velocity, and $\Delta z$ is the increment in distance, that is, the distance covered during the small time interval $\Delta t$. The formula

$$\Delta z \approx z'(t)\,\Delta t \tag{2.4.10}$$

then signifies that the distance covered is equal to the product of the instantaneous velocity and the time interval. But the instantaneous velocity itself varies with time. Therefore, the approximate expression (2.4.10) is true only when the instantaneous velocity does not perceptibly change during the time $\Delta t$. Hence, the faster $z'(t)$ varies, the smaller $\Delta t$ must be taken in (2.4.10); and conversely, the slower $z'(t)$ varies, the larger $\Delta t$ may be taken. That is to say, the magnitude of the increment $\Delta t$ for which formula (2.4.10) still yields a small error depends on the rate of change of the derivative over the interval $\Delta t$. Of course, the same holds for an arbitrary function $y = f(x)$ of an arbitrary independent variable $x$, where the derivative $dy/dx$ may still be considered as the rate of variation of function $y$ at a given value of independent variable $x$ (see Figure 2.3.1 and the text referring to this figure).

The cases we have examined (where we denote by $t$ the independent variable and by $z$ the value of the function and assume that $t$ is the time and $z$ is the distance covered) confirm this conclusion. In the first example, $z = t^3$, when $t$ varies from 1 to 2 (at $\Delta t = 1$), the derivative $z'(t) = 3t^2$ varies from 3 to 12 (which is to say, by a factor of 4). In the second example, $z = \sqrt{t}$, when $t$ varies from 4 to 9 ($\Delta t = 5$), the derivative $z'(t) = 1/(2\sqrt{t})$ varies from 0.25 to 0.167 (or roughly by 30%). Therefore, in the latter instance formula (2.4.7) or (2.4.10) yields a good result for larger values of $\Delta t$; for example, at $\Delta t = 4$, which constitutes a 100% increase against the initial value of $t$, the error introduced by this formula constitutes only 6% of the true value of the function, while at $\Delta t = 5$, which constitutes a 125% increase against the initial value of $t$, the error constitutes $1/12 \approx 8\%$ of the true value. On the other hand, if we turn to the function $z = t^3$, we see that (2.4.7) becomes invalid already at $\Delta t = 0.5$, when the error constitutes about 25% of the true result (at $\Delta t = 1$ the error constitutes 50% of the true result, since the formula gives 4 instead of 8). Of course, what we have said here is valid for both positive and negative values of $\Delta t$ (see Exercise 2.4.5).
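The relative errors quoted for the two examples can be checked numerically. The following Python sketch is an illustration only; the helper name `relative_error` is an assumption made here, and the two function and derivative pairs are the ones discussed in the text.

```python
import math


def relative_error(f, df, t, dt):
    """Relative error of f(t) + df(t)*dt measured against the true f(t + dt)."""
    exact = f(t + dt)
    approx = f(t) + df(t) * dt
    return abs(exact - approx) / abs(exact)


# Square root near t = 4: the derivative changes slowly.
sqrt_err = relative_error(math.sqrt, lambda t: 0.5 / math.sqrt(t), 4.0, 5.0)

# Cube near t = 1: the derivative changes quickly.
cube_err = relative_error(lambda t: t ** 3, lambda t: 3 * t ** 2, 1.0, 0.5)

print(f"sqrt, t = 4, dt = 5:   {sqrt_err:.1%}")
print(f"cube, t = 1, dt = 0.5: {cube_err:.1%}")
```

The same helper can be called with negative values of dt, which is the situation addressed in Exercise 2.4.5.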

A detailed discussion of the range of application of formula (2.4.7), of the question of estimating the error introduced by this formula, and of the various refinements is given in Chapter 6.

Anticipating the first sections in Chapter 6, we note that for small differences $t_2 - t_1$ the function $z(t_2)$ can be represented as

$$z(t_2) = z(t_1) + a\,(t_2 - t_1) + b\,(t_2 - t_1)^2 + c\,(t_2 - t_1)^3 + \dots, \tag{2.4.11}$$

and this relationship can be considered exact if we assume that the sum on the right-hand side consists of an infinite number of summands (the exact meaning of this assertion will be discussed in Chapter 6). Since we assume that $t_2 - t_1$ is small, each term in (2.4.11) is smaller than the preceding one, since it contains a higher power of the small difference $t_2 - t_1$. The first two terms of the general formula (2.4.11) coincide with the right-hand side of the approximate formula (2.4.7a), since $a = z'(t_1) = dz(t_1)/dt$. The difference between the approximate formula and the exact one lies in the fact that the latter contains a term proportional to $(t_2 - t_1)^2$ and terms with higher powers of this small quantity.

The common terminology here is as follows: $t_2 - t_1$ and $a\,(t_2 - t_1)$ are said to be (provided that $t_2 - t_1 = \Delta t$ is small) quantities of the first order of smallness (here and in what follows we assume that $t_2$ tends to $t_1$ and $\Delta t$ tends to zero); the quantities $(t_2 - t_1)^2$ and $b\,(t_2 - t_1)^2$ are said to be quantities of the second order of smallness; $(t_2 - t_1)^3$ and $c\,(t_2 - t_1)^3$ are of the third order of smallness; etc. With this in mind, we can say that (2.4.7a) is exact to within first-order terms, while formula (2.4.7a) is exact to within second-order terms (the word “smallness” is usually dropped).

If we employ (2.4.7a) to derive an approximate expression for the derivative, we will be forced to divide both sides of (2.4.7a) by $t_2 - t_1$, which reduces the order of smallness by one. The equality

$$z'(t_1) \approx \frac{z(t_2) - z(t_1)}{t_2 - t_1}$$

is approximate, and the error introduced by it is not of the second but of the first order of smallness, that is, the error is proportional to $t_2 - t_1$; nevertheless, as $t_2 \to t_1$, that is, as $\Delta t = t_2 - t_1 \to 0$, the error tends to zero. This fact follows from the definition of the derivative given above.
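That the error of the difference quotient is of the first order can be seen in a small numerical experiment. The Python sketch below is an illustration under an assumed choice of function, $z = t^3$ at $t_1 = 1$; it divides the error by the step, and the quotient stays close to a constant, which is what proportionality to $t_2 - t_1$ means.

```python
def forward_difference(f, t1, t2):
    """Approximate f'(t1) by (f(t2) - f(t1)) / (t2 - t1)."""
    return (f(t2) - f(t1)) / (t2 - t1)


def cube(t):
    return t ** 3


EXACT = 3.0  # exact derivative of t**3 at t = 1


def error_over_dt(dt):
    """Error of the forward difference at t1 = 1, divided by dt."""
    return (forward_difference(cube, 1.0, 1.0 + dt) - EXACT) / dt


for dt in (0.1, 0.01, 0.001):
    print(f"dt = {dt:<6g} error/dt = {error_over_dt(dt):.4f}")
```

For this function the error equals $3\,\Delta t + (\Delta t)^2$, so the printed quotient tends to 3 while the error itself tends to zero.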


Exercises 

2.4.1. Find the derivatives of the following functions: (a) $y = x^4$, (b) $y = 4x^3 - [\ldots] + 2x - 1$, (c) $y = (2x + 1)^2$, (d) $y = 1/x^2$, (e) $y = a\,(x + 1/x)$, and (f) $y = ax^2 + b/x^2$.

2.4.2. Prove that the derivative of $y = \sqrt{x}$ is equal to $1/(2\sqrt{x})$. [Hint. Multiply the numerator and denominator of $(\sqrt{x + \Delta x} - \sqrt{x})/\Delta x$ by the sum $\sqrt{x + \Delta x} + \sqrt{x}$.]

2.4.3. Prove that the derivative of $y = \sqrt{x^3}$ is equal to $(3/2)\sqrt{x}$. [Hint. Use the method employed in Exercise 2.4.2.]

2.4.4. Find $(1.2)^2$, $(1.1)^2$, $(1.05)^2$, and $(1.01)^2$ using formula (2.4.7). Compare the results with the exact values.

2.4.5. Using the expression for the derivative of the function $z(t) = 2 + 20t - 5t^2$, find $z(1.1)$, $z(1.05)$, and $z(0.98)$. Compare the results with the exact values. [Hint. In the last case take $t = 1$ and $\Delta t = -0.02$.]

2.5 A Tangent to a Curve 

Using the concept of a derivative, we can solve an important geometric problem: to find the tangent line to a curve given by the equation $y = f(x)$. The coordinates of the point $A$ of tangency are given: $x = x_0$ and $y = y_0 = f(x_0)$.

From the viewpoint of analytic (coordinate) geometry introduced by Descartes (see Chapter 1), to find the tangent line means to find the equation of the line. It is clear that the tangent line is one of the straight lines passing through the point of tangency. But the equation of any straight line passing through a given point $A(x_0, y_0)$ can be written as $y - y_0 = k\,(x - x_0)$ (see formula (1.3.3)). In order to find the equation of the tangent line, it remains to determine the quantity $k$, the slope of the tangent line (its “steepness”). To do this, we first find the slope of the line passing through two given points $A$ and $B$ of the curve at hand (Figure 2.5.1). We call this line a secant line. When these two points of the curve approach each other, the line approaches the tangent line.

Figure 2.5.1

In Figure 2.5.1 we see two secant lines through points $A$ and $B$ and through points $A$ and $C$, with $C$ lying closer to $A$ than $B$. The closer point $B$ or $C$ is to $A$, the closer the secant line $AB$ or $AC$ is to the tangent line. Therefore, the slope of the tangent line is equal to the limit approached by the slope of the secant line as the distance between the two points of intersection of the secant and the curve tends to zero (or the points of intersection approach each other). This fact can be taken as the definition of the tangent line; of course, it would be more precise to say that point $B$ tends to point $A$ or that the two points $B$ and $C$ tend to $A$ and the secant line $BC$ tends to the tangent line at the chosen point $A$.

The slope $k_s$ of a secant line can easily be expressed in terms of the coordinates of the intersection points. For one of the points of intersection of the secant and the curve we take the point $A(x_0, y_0)$ at which we desire to draw a tangent line to the curve; we denote by $x_1$ and $y_1$ the coordinates of the second point of intersection, $B$. Since both points lie on the curve whose equation is $y = f(x)$ (“belong to this curve” is the usual phrase), it follows that $y_0 = f(x_0)$ and $y_1 = f(x_1)$. As can be seen from Figure 2.5.2, the slope of the secant line is

$$k_s = \tan\alpha = \frac{y_1 - y_0}{x_1 - x_0} = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$

(see Section 1.3).

Figure 2.5.2

In order to obtain the slope of the tangent line at the point $(x_0, y_0)$, we must take point $B$ closer and closer to $A$, which means that $x_1$ must approach $x_0$. Consequently, the slope $k$ of the tangent line is equal to the limit of $k_s$ as $x_1$ tends to $x_0$:

$$k = \lim_{x_1 \to x_0} k_s = \lim_{x_1 \to x_0} \frac{f(x_1) - f(x_0)}{x_1 - x_0}.$$

We denote the difference $x_1 - x_0$ by $\Delta x$; then $x_1 = x_0 + \Delta x$ and $\Delta f = f(x_1) - f(x_0) = f(x_0 + \Delta x) - f(x_0)$. In this notation, the slopes $k_s$ and $k$ of the secant and tangent lines, respectively, are given by the formulas

$$k_s = \frac{\Delta f}{\Delta x}, \qquad k = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}.$$

Thus, the slope of the tangent line is equal to the derivative of the function $f(x)$ at point $x_0$:

$$k = \left.\frac{df}{dx}\right|_{x = x_0} = f'(x_0). \tag{2.5.1}$$

We know that the derivative $f'(x)$ of a function $f(x)$ is itself a function of $x$. Since we sought the slope of the tangent line at a fixed point $A(x_0, y_0)$, we assumed, in computing the limit of $\Delta f/\Delta x$, that $x = x_0$ is fixed. That is why in the final formula we have $f'(x_0)$, which is the value of the derivative $f'(x)$ at $x = x_0$. On the other hand, the function $f'(x)$, obviously, expresses the slope of a tangent line at a variable point $(x, y)$ or $(x, f(x))$ of this curve.

Let us consider the example of a parabola $y = x^2$, that is, $f(x) = x^2$. We set up the equation of the tangent line at the point $x_0 = 2$, $y_0 = f(x_0) = 4$.

Figure 2.5.3

We know the derivative of $x^2$:

$$\frac{d(x^2)}{dx} = 2x$$

(cf. (2.3.6a)). Consequently, at the point of interest the slope of the tangent line is

$$k = f'(x_0) = 2x_0 = 4,$$

while the equation of the tangent line is (Figure 2.5.3)

$$y - y_0 = k\,(x - x_0), \quad \text{i.e.} \quad y - 4 = 4\,(x - 2),$$

or

$$y = 4x - 4.$$

In general, the slope of the tangent line to the parabola $y = x^2$ at point $(x_0, x_0^2)$ is equal to $2x_0$ (since here $dy/dx = 2x$); hence, the equation of the tangent line is

$$y - x_0^2 = 2x_0\,(x - x_0), \quad \text{or} \quad y = 2x_0 x - x_0^2. \tag{2.5.2}$$

We have thus arrived at the result established in Section 1.4 by a different method, namely, that the straight line $y = kx + b$ touches the parabola $y = x^2$ if and only if the coefficients $k$ and $b$ in the equation of the straight line can be represented as $2x_0$ and $-x_0^2$, which depend on the parameter $x_0$; in other words, the condition for a straight line $y = kx + b$ touching a parabola $y = x^2$ can be expressed as $b = -(k/2)^2$ (cf. (1.4.8a)).
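The recipe for the tangent to the parabola $y = x^2$ can be written out as a small Python sketch. The function name is a hypothetical choice made for this illustration; the code returns the slope $k$ and the intercept $b$ of the tangent line $y = kx + b$ at a given point of tangency.

```python
def tangent_to_parabola(x0):
    """Slope k and intercept b of the tangent to y = x**2 at x = x0."""
    k = 2.0 * x0            # derivative of x**2 evaluated at x0
    b = x0 ** 2 - k * x0    # y - y0 = k (x - x0) rewritten as y = k x + b
    return k, b


k, b = tangent_to_parabola(2.0)
print(f"tangent at x0 = 2: y = {k:g} x + ({b:g})")

# The tangency condition b = -(k/2)**2 holds for any point of tangency.
for x0 in (-1.5, 0.0, 0.5, 3.0):
    k, b = tangent_to_parabola(x0)
    print(f"x0 = {x0:5.1f}   k = {k:5.1f}   b = {b:6.2f}   -(k/2)^2 = {-(k / 2) ** 2:6.2f}")
```

The last two columns of the printout coincide, which is the condition $b = -(k/2)^2$ stated above.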

Here is another example. Take the semicubical parabola $y = \sqrt{x^3}$ (see Section 1.5). Here $y' = (3/2)\sqrt{x}$ (see Exercise 2.4.3), whence $k = f'(x_0) = (3/2)\sqrt{x_0}$. In particular, at the origin the slope of the tangent line is $k = f'(0) = (3/2)\sqrt{0} = 0$; hence, at point $O$ the semicubical parabola touches the $x$ axis, as depicted in Figure 1.5.7.

Without the aid of derivatives, it is rather difficult to draw a tangent line to a curve given by the equation $y = f(x)$: you have to compute a large number of points of the curve, then, using a French curve, draw the curve through these points and, by eye, apply a ruler to the curve at the given point and pay special attention to see that you do not intersect the curve near the point of tangency. Using derivatives, we find the equation of the tangent line, then from this equation we find two points lying on the straight line given by this equation, and then we draw the straight line (tangent line) with a ruler (through the two points). For one of the two points it is natural to take the point of tangency itself, $A(x_0, y_0)$. The second point $C$ may be taken on the straight line at a good distance from $A$ (the farther the better); we can then more accurately determine the slope and the position of the tangent as the straight line passing through the two points $A$ and $C$.

For example, above we found the equation of a straight line tangent to the parabola $y = x^2$ at the point $x_0 = 2$, $y_0 = 4$. It has the form $y = 4x - 4$. Let us find the coordinates of two points on this line. Putting $x = 2$, we find that $y = 4$. This is the point of tangency $A(2, 4)$; the coordinates need not have been computed since the tangent must pass through it. For the second point, $C$, we choose the point of intersection of the tangent line and the $y$ axis. Putting $x = 0$, we find $y = -4$, so that $C = C(0, -4)$ (see Figure 2.5.3).

Note the curious fact that $y = -y_0$ for $x = 0$: the point $C$ of intersection of the tangent line with the $y$ axis lies below the $x$ axis just as much as the point of tangency itself lies above the $x$ axis. This is not accidental. The rule holds true for all tangents to quadratic parabolas with an equation $y = ax^2$ (with $a > 0$). Indeed, if the tangent is drawn at the point $A(x_0, y_0 = ax_0^2)$, its equation is

$$y - y_0 = 2ax_0\,(x - x_0) \tag{2.5.2a}$$

(cf. (2.5.2)), and for $x = 0$ we get $y - y_0 = -2ax_0^2$, or $y = y_0 - 2ax_0^2 = y_0 - 2y_0 = -y_0$. Thus, the tangent passes through the points $A(x_0, y_0 = ax_0^2)$ and $C(0, y = -y_0 = -ax_0^2)$.
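The rule for the intercept can be checked numerically for several parabolas $y = ax^2$. The Python sketch below is illustrative; the function name and the sample values of $a$ and $x_0$ are assumptions made here, not data from the text.

```python
def tangent_y_intercept(a, x0):
    """y-coordinate at which the tangent to y = a*x**2 at x0 meets the y axis."""
    y0 = a * x0 ** 2
    slope = 2.0 * a * x0             # derivative of a*x**2 at x0
    return y0 + slope * (0.0 - x0)   # tangent line evaluated at x = 0


for a, x0 in ((1.0, 2.0), (0.5, 3.0), (2.0, -1.0)):
    y0 = a * x0 ** 2
    print(f"a = {a}, x0 = {x0}: y0 = {y0}, intercept = {tangent_y_intercept(a, x0)}")
```

In each case the intercept is the negative of the ordinate of the point of tangency, in agreement with (2.5.2a).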




When plotting a curve by points it is hard to construct the curve if there are few points. Using derivatives, you can draw the tangents to the curve at these points beforehand, and then the curve itself can be drawn with greater ease and accuracy.

Pictorially, it is clear that the tangent is horizontal at the points of maximum and minimum; in future we will return to this aspect many times. The equation of a horizontal straight line is $y = \text{constant}$, and the slope of the horizontal straight line is $k = 0$.

Hence, we arrive at the following theorem: the derivative of a function $y = f(x)$ (the graph of which is a curve) is zero at the points of minimum and maximum of the curve.

Using this theorem, one can find the $x$-coordinates of the points of maximum and minimum of the curve. The respective $y$-coordinates can then be easily found by substituting the established values of $x$ into the equation of the curve. It is also obvious that knowing the coordinates of the points of maximum and minimum we can draw the curve itself more accurately.

It is a useful exercise to draw a curve $y(x)$ freehand and then, at least approximately but rapidly, draw the curve $y'(x)$, noting the sign of $y'$ and the points where $y'(x)$ vanishes. This is illustrated in Figure 2.5.4a (the graph of $y(x)$) and Figure 2.5.4b (the graph of the derivative $y'(x)$).

For the derivative $y'(x)$, the points at which $y(x)$ vanishes are of no interest. If the curve $y(x)$ is moved upward or downward (in Figure 2.5.4a the curve has been moved upward and the result is depicted by the dashed curve) by an arbitrary segment $b$, the curve $y'(x)$ does not change in any way because in translation in the vertical direction all the slopes remain the same (to be precise, we must speak of the slopes of the tangent lines to this curve at all points: compare the upper and lower curves in Figure 2.5.4a and, in particular, the tangents to these curves at points $A$ and $B$). This result is in accord with the property of derivatives: the functions $y = f(x)$ and $y_1 = f(x) + b$ (the graphs of these functions are obtained from each other through translation upward or downward) have equal derivatives at corresponding points.

Another mathematical game is this: draw freehand the graph of the derivative and then give a rough construction of the graph of the function. Here you have to specify in arbitrary fashion one point $(x_0, y(x_0))$ on the curve and then draw the graph up or down (depending on the sign of the derivative).

In conclusion we must dwell on one important point closely related, as we will see below, with the question discussed concerning the connection between the slope of a tangent line to a curve and the derivative (above we ignored this point on purpose). In physical problems, the quantities $x$ and $f(x)$ are usually dimensional (for example, $x$ or $t$ is time and $y = f(x)$ or $z = f(t)$ is distance). But this means that $dy/dx$ is dimensional, too. Clearly, when $z$ is measured in centimeters and $t$ in seconds, $dz/dt = v(t)$ is the velocity, with dimensions cm/s. If, say, $y$ is expressed in kilograms and $x$ in months (say, $y = f(x)$ represents the weight of a baby, or its mass, as a function of time), the derivative $dy/dx$ ($= \lim \Delta y/\Delta x$) expresses the rate of weight gain by the baby and has the dimensions kg/month. If $y$ is the current in a conductor and $x$ is the conductor’s resistance, then the derivative $y' = dy/dx$ is measured in A/$\Omega$. If $x$ is the edge of a cube and $y$ its volume, then $dy/dx$ has the dimensions of cm$^3$/cm = cm$^2$. Examples of this type abound. In general, if the independent variable $x$ is measured in units of $e_1$ and the function $y = f(x)$ in units of $e_2$, then the derivative $dy/dx$ has the dimensions of $e_2/e_1$.

Figure 2.5.4

But, as we know, the trigonometric function $\tan\alpha$ is dimensionless (being equal to the ratio of the lengths of two line segments). Therefore we cannot simply write $dy/dx = \tan\alpha$ since the left and right members have different dimensions. Only in the rare instances when the two quantities, $x$ and $y$, have the same dimensions (say, when we are studying the speed of a yacht as a function of the speed of wind) or both are dimensionless (say, the function $y = \sin x$, where $x$ is measured in radians) does the relationship $dy/dx = \tan\alpha$ have a meaning.

How can we get round this difficulty? Note that above we assumed the scales along the $x$ and $y$ axes to be the same, that is, that the unit of measurement for $x$ and the unit of measurement for $y$ are depicted by the same segments; this assumption seemed so natural that we did not even spell it out. But in the case where $x$ and $y$ have different dimensions (and this is the general case) the above assumption is not only unnatural but really has no meaning, since there is no way in which we can assume that 1 s = 1 cm. Therefore, in reality the scales along the two axes are different, and because of this we cannot write $dy/dx = \tan\alpha$.

Suppose that $y$ is the distance traveled and $x$ is the time. We construct the graph of the position of a body depending on the time, $y = y(x)$. On the axis of ordinates we lay off $y$ using the scale: 1 meter of distance equals 1 cm in the drawing. On the axis of abscissas we lay off time using the scale: 1 second of time equals 1 cm in the drawing. Then the velocity $v$ expressed in meters per second and equal to the derivative $dy/dx$ will indeed be equal to $\tan\alpha$, the tangent of the angle formed by the tangent line and the $x$ axis. But if we choose a different scale for $x$, say, 1 s $= l$ cm in the drawing, we get

$$\tan\alpha = \frac{1}{l}\,\frac{dy}{dx} = \frac{1}{l}\,v.$$

If $l = 5$, we have $\tan\alpha = (1/5)\,dy/dx = (1/5)\,v$.

In the general case, if one $x$ unit in the drawing is laid off to a scale of $l_1$ cm and one $y$ unit is laid off to a scale of $l_2$ cm, then

$$\tan\alpha = \frac{l_2}{l_1}\,\frac{dy}{dx}. \tag{2.5.3}$$

The scale factors $l_1$ and $l_2$ in this formula improve the situation: they make the formula proper from the standpoint of dimensions. Thus, in the example with a baby’s weight, $l_1$ has the dimensions of cm/month (1 cm in the drawing per month of age) and $l_2$ has the dimensions of cm/kg (1 cm on the graph per kilogram of weight), so that $(l_2/l_1)\,dy/dx$ is dimensionless: in the formula all the dimensions cancel out and the formula becomes meaningful (and correct).

All this should be borne in mind when comparing the derivative and the slope of a curve representing the basic function (i.e. the slope of the tangent line to the graph at the point of interest).
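Formula (2.5.3) is easy to try out numerically. The Python sketch below is a minimal illustration; the function name and the sample scales are assumptions chosen here, with `l1` and `l2` standing for the scale factors of the $x$ and $y$ axes in centimeters of paper per unit.

```python
import math


def drawing_angle_deg(dy_dx, l1, l2):
    """Angle (in degrees) between the tangent line and the x axis in the drawing.

    l1: centimeters of paper per unit of x; l2: centimeters per unit of y.
    """
    return math.degrees(math.atan((l2 / l1) * dy_dx))


velocity = 1.0  # hypothetical derivative dy/dx, in meters per second

# Equal scales (1 s = 1 cm, 1 m = 1 cm): tan(alpha) equals the derivative.
print(drawing_angle_deg(velocity, l1=1.0, l2=1.0))

# Time axis stretched to 1 s = 5 cm: the same motion looks five times flatter.
print(drawing_angle_deg(velocity, l1=5.0, l2=1.0))
```

The derivative is the same in both calls; only the picture changes, which is the point made in the text about comparing a derivative with a slope read off a graph.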


Exercises 

2.5.1. Construct the graph of the function $y = x^2 + 1$ within the range from $x = -1.5$ to $x = 2.5$ and draw the tangent lines at the points $x = -1$, $x = 0$, $x = 1$, and $x = 2$.

2.5.2. Construct the graph of the function $y = x^3 - 3x^2$, $-1 < x < 3.5$; draw the tangent lines at $x = -1, 0, 3$. Find the points with horizontal tangent lines.

2.5.3. On the curve $y = x^3 - x + 1$ find points having horizontal tangents. Construct the curve for $-2 < x < 2$. [Hint. In Exercises 2.5.1 to 2.5.3 it is advisable to use graph paper and a large scale.]

2.5.4. Construct (freehand) the curve $y'(x)$ for the function $y(x)$ depicted in Figure 2.5.5.

2.5.5. The graph of a derivative $y'(x)$ is depicted in Figure 2.5.6. Construct (freehand) the graph of the function $y(x)$ passing through the point $(5, 0)$. At what angle will $y(x)$ intersect the axis of ordinates? At what angle will $y(x)$ intersect the axis of abscissas at point $x = 5$? [Hint. In Exercises 2.5.4 and 2.5.5, first copy Figures 2.5.5 and 2.5.6 on a fresh sheet of paper and then construct the respective graphs; it is advisable to construct these graphs on the same sheet of paper with the initial graphs, say, strictly below the respective graph.]

2.5.6. Set up the equations of the tangent lines to the cubical parabola $y = x^3$ at the points with (a) $x = 0.5$ and (b) $x = 1$. Find the points of intersection of the tangent lines with the $x$ and $y$ axes.

2.5.7. Find the general rule that enables determining the points of intersection with the $x$ and $y$ axes of the straight lines tangent to the curves (a) $y = ax^2$ and (b) $y = bx^3$ at point $(x_0, y(x_0))$.

2.6 Increase and Decrease of Functions. 
Maxima and Minima 

Observing a curve that depicts the behavior of a function $y = f(x)$ (say, the dependence of temperature $T(t)$ on time $t$), one can easily see the points at which the function grows, or increases (say, point $A$ in Figure 2.6.1), the points at which the function decreases (say, point $B$), the points of maximum, where growth of the function is replaced with its decrease (point $C$), and points of minimum, where decrease of the function is replaced with its growth (point $D$). But how must we define such concepts with sufficient mathematical rigor, and how can we determine, without looking at the graph, the behavior of a function at, say, point $x = x_0$ (drawing a graph always involves errors and therefore constitutes an unreliable method)?

Figure 2.6.1

Without a knowledge of derivatives, we have to seek the answer to these questions numerically, that is, we must take the temperature $T$ at a given time $t$ and then take it at some following time $t_1$ and see if it has increased or decreased. This is clearly not a reliable approach: even if $T(t_1)$ is greater than $T(t)$, it still might be that at time $t$ the temperature fell, then soon afterward (after $t$ but prior to $t_1$) it reached a minimum, and only then began to grow and by $t_1$ had risen above $T(t)$.

Using derivatives we get an exact solution: we have to find the derivative $dy/dx$. If $dy/dx = y'(x)$ is positive for a given $x$, then $y(x)$ is an increasing function, that is, if $x$ increases by a small quantity $\Delta x$, the value of $y$ increases by a small amount $\Delta y \approx y'(x)\,\Delta x$ (as was clarified earlier, the smaller the value of $\Delta x$ the more exact the equation). We consider $\Delta x > 0$ (say, time $x$ increases). If $y'(x) > 0$, $\Delta x > 0$, then, of course, $\Delta y > 0$, that is, the value of $y$ increases with $x$ (say, the temperature increases in time). The numerical value of the (positive) derivative shows how fast $y$ (the temperature) rises: in the example with the temperature we find that if $T'(t) = 10$, then in the vicinity of $t$ the temperature increases 10 times faster than time (for example, by 10 °C each second), while if $T'(t) = 0.1$, then the temperature increases 10 times slower than time (however, compare this with what was said in Section 2.5 about the dimensions of the derivative and its connection with the choice of units for the independent variable and the function). But if $y'(x) < 0$, $\Delta x > 0$, then $\Delta y < 0$; for instance, if $T'(t) < 0$, the temperature $T(t + \Delta t)$ at $t + \Delta t$ will be lower than $T(t)$ at the given moment in time. Thus, a positive derivative indicates that the function is increasing and a negative derivative that the function is decreasing.

The expressions “increasing function” and “decreasing function” are applied to any functions $y(x)$ and not only to those that depend on time (functions of time); here the independent variable may be an arbitrary quantity (with or without dimensions). An increasing (decreasing) function is one in which $y$ increases (decreases) as the independent variable $x$ increases. The derivative $dy/dx$ is what indicates the rate of growth, that is, the ratio of the variation in $y$ to the corresponding (small) variation in $x$. A negative rate of growth means a decrease in $y$ as $x$ increases, and if $dy/dx < 0$, then $|dy/dx| = -dy/dx$ is the rate at which the function decreases.

The expression “the quantity $y$ has a large negative derivative with respect to $x$” means that $y$ decreases rapidly as $x$ increases, while a positive $dy/dx$ means that $y$ increases with $x$. In this way the derivative of $y$ points to the tendencies in the variation in $y$ and enables us to know what to expect from further variations in the independent variable. This constitutes the main reason for studying the derivative of a function.

Physicists and mathematicians, espe- 
cially those in the making (who have 
just learned what a derivative is), fre- 
quently put it to use in everyday life 
like this: “the derivative of my mood 
with respect to time is positive” in 
place of “my mood is definitely im- 
proving.” 

Solve this joke problem: what sign 
does the derivative of my mood have 
with respect to the distance from the 
dentist’s chair? My mood deteriorates, 
“decreases,” becomes “negative” as the 
distance decreases; hence, the deriva- 
tive is positive. 

The serious editor may complain about abuse of the English language, but actually this free-style use of mathematical concepts is good practice for further serious applications of mathematics.

There are functions that have the same sign of the derivative for any
values of the variable: such is the property of the linear function
y = kx + b, since here the derivative dy/dx = k is a constant. Later
on we will see that in the case of the exponential function y = a^x
the derivative has a constant sign (although it is not constant in
magnitude) for arbitrary x. However,
a derivative need not have a constant 
sign; the sign of the derivative of a 
given function may be different for 
different values of the independent 
variable. 

Let us imagine a function y(x) whose derivative y'(x) is positive for
x < x_0 and negative for x > x_0; in short, y'(x) > 0, x < x_0, and
y'(x) < 0, x > x_0.

What can we say about such a function? We begin with x < x_0. As x
increases to x_0, the function y(x) will increase; however, as x
continues to increase the derivative becomes negative and y falls. The
conclusion is that for x = x_0 the function y(x) has a maximum.

Consider the contrary case: y'(x) < 0 for x < x_0 and y'(x) > 0 for
x > x_0. Reasoning as before, we conclude that in this case y(x) has a
minimum at x = x_0.

If a function y(x) is defined by a formula associated with a smooth
curve, so that y'(x) also varies smoothly with x, then the different
signs of y'(x) for x < x_0 and x > x_0 in both cases signify that the
derivative vanishes at x = x_0:

y'(x_0) = (dy/dx)|_(x = x_0) = 0.   (2.6.1)

Thus, as already noted in Section 2.5, by equating the derivative to
zero we can find those values of the independent variable for which
the function has a maximum or a minimum.[2.7] Now we can refine the
general theorem discussed in Section 2.5: at the points where a
function attains maxima the derivative changes its sign from plus to
minus, while at the points where a function attains minima the
derivative changes its sign from minus to plus.

Here are some examples. First we turn to the function
y = 3x^3 - x^2 - x discussed in Section 1.1 (see the table for this
function in Section 1.1). Judging by this table, one might think that
the function increases for all values of x since every increase in x
by unity caused an increase in y.

Let us calculate the derivative, however:

y'(x) = 9x^2 - 2x - 1.

Taking x = 0, we find that y'(0) = -1 < 0, which means that when
x = 0, the function is decreasing. This refutes the supposition
(obtained from a glance at the table) that the function is an
everywhere increasing function.

We equate y'(x) to zero. Solving the equation
y'(x) = 9x^2 - 2x - 1 = 0 yields two roots: x_1 ≈ -0.24 and
x_2 ≈ 0.46.

Now we form a detailed table including the maximum and minimum points
just found:

x    -3     -2     -1     -0.3     -0.24    -0.18
y    -87    -26    -3     0.129    0.140    0.131

x    0      0.40     0.46     0.52     1      2
y    0      -0.372   -0.381   -0.370   1      18

We see that, true enough, on the portion from x = -0.24 to x = 0.46
the function y falls from +0.14 to -0.38. A comparison of the value
y(-0.24) with the adjacent values y(-0.30) and y(-0.18) confirms the
fact that when x ≈ -0.24, y reaches a (local) maximum, the adjacent
values of y being smaller. The graph of the function
y = 3x^3 - x^2 - x is given in Figure 2.6.2.

Figure 2.6.2

[2.7] More precisely, the values at which the function may have (but
also may not have) a maximum or a minimum. For example, the function
y = x^3 has a derivative, y' = 3x^2, which vanishes at x = 0; however,
at this point the function (see its graph in Figure 1.5.1a) has
neither a maximum nor a minimum. (Other exceptions to our rule that
refer to curves that are not smooth are discussed in Section 7.2.)
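The calculation for y = 3x^3 - x^2 - x can be repeated on a computer. The following is only a minimal Python sketch: the names y, dy, x1, and x2 are illustrative choices, not notation from the text. It solves 9x^2 - 2x - 1 = 0 by the quadratic formula and prints the function and its derivative a little to the left and right of each root, so that the change of sign of y' can be seen directly.

```python
import math

def y(x):
    """The function studied in the text: y = 3x^3 - x^2 - x."""
    return 3 * x**3 - x**2 - x

def dy(x):
    """Its derivative: y' = 9x^2 - 2x - 1."""
    return 9 * x**2 - 2 * x - 1

# Roots of 9x^2 - 2x - 1 = 0 from the quadratic formula.
disc = math.sqrt((-2) ** 2 - 4 * 9 * (-1))
x1 = (2 - disc) / 18   # candidate for the local maximum
x2 = (2 + disc) / 18   # candidate for the local minimum

# Tabulate y and y' slightly to the left of, at, and to the right of each root.
for x in (x1 - 0.06, x1, x1 + 0.06, x2 - 0.06, x2, x2 + 0.06):
    print(f"x = {x:8.4f}   y = {y(x):8.4f}   y' = {dy(x):8.4f}")
```

The step 0.06 mirrors the spacing used in the detailed table; any small step would serve equally well.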

This example shows once more that the word “maximum” should not be
understood as meaning the largest of all possible values of y. Indeed,
at the maximum point y(-0.24) ≈ 0.14, while y = 1 for x = 1, y = +18
for x = 2, y = 2890 for x = 10, and so on, y rapidly increasing
without bound as x increases without bound. In what way does the
maximum point x_max ≈ -0.24, y_max ≈ 0.14 that we found differ from
all other points of the curve?

The difference is that for close-lying values of x, both larger than
x_max and less than x_max, the value of y is less than
y_max (= y(x_max)). This peculiarity of x_max is clearly seen in the
table (e.g. compare y(-0.30), y(-0.24), and y(-0.18)). The same
arguments can be applied to the minimum x_min ≈ 0.46,
y_min ≈ -0.381: for large (in absolute value) negative x's the value
of y decreases without bound and becomes less than y_min (and, in
general, y can be made less than any negative number), but x_min
differs in that the value y_min (= y(x_min)) is less than the values
of y for x close to x_min. The condition of a vanishing derivative
enables finding just such (local) maxima and minima.

For the second example we take the function y = x^3 - x discussed in
Section 1.5. The derivative of this function is y' = 3x^2 - 1, whereby
y'(x) = 0 at x = ±1/√3. Let us see how the sign of the function
y'(x) = 3x^2 - 1 varies in the vicinity of points x = 1/√3 ≈ 0.577 and
x = -1/√3 ≈ -0.577. Since y' is positive for |x| > 1/√3 and negative
for |x| < 1/√3, we readily find that at x = -1/√3 the function y(x)
has a (local) maximum and at x = 1/√3 the function y(x) has a (local)
minimum (see Figure 1.5.3).

Now we can give a complete explanation why the graphs of the functions
y = x^3 + x and y = x^3 - x (Figures 1.5.2 and 1.5.3) differ. Let us
consider the general equation

y = x^3 + cx,   (2.6.2)

where c is an arbitrary number. In this case y' = 3x^2 + c. It is
clear that for c > 0 the derivative y' is positive everywhere, which
means that y increases for all values of x, that is, there are neither
maxima nor minima. On the other hand, for c < 0 the derivative y'
vanishes at x = ±√(-c/3) = ±x_0, and in the interval from x = -∞ to
x = -x_0 the derivative y' is positive (the function is increasing),
for -x_0 < x < x_0 the derivative is negative (the function is
decreasing), and for x_0 < x < ∞ the derivative is again positive (the
function is again increasing); here the symbols -∞ and ∞ denote, as
usual, very large negative and positive numbers. Thus, at x = -x_0 the
function has a maximum and at x = x_0 it has a minimum. If we make c
smaller and smaller (in absolute value), the maximum and minimum
approach each other: at c = 0 the points ±x_0 = ±√(-c/3) merge into a
single “critical” point x = 0, the point where the derivative of y
vanishes. In this case (for the curve y = x^3) the critical point is
neither a maximum nor a minimum, since if we consider “neighboring”
curves corresponding to negative values of c that are small in
absolute value, we might decide that our point is simultaneously a
maximum and a minimum, which is clearly impossible, and it is clearly
unjustified to expect that the point is a maximum (and not a minimum)
or a minimum (and not a maximum). All these peculiarities of the
curves specified by Eq. (2.6.2) are shown in Figure 2.6.3. We see that
there are two distinct cases: c < 0 and c > 0; the “intermediate” case
corresponds to the “limit” curve y = x^3, for which c = 0.
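The way the maximum and minimum of y = x^3 + cx approach each other and merge as c tends to zero can be illustrated numerically. The following Python sketch is offered only as an illustration; the function name critical_points and the sample values of c are assumptions made here, not part of the text.

```python
import math

def critical_points(c):
    """Real roots of y' = 3x^2 + c for the curve y = x^3 + c*x.

    c > 0: no real roots (the function increases everywhere);
    c = 0: the single critical point x = 0;
    c < 0: the two points -x0 and +x0 with x0 = sqrt(-c/3).
    """
    if c > 0:
        return []
    if c == 0:
        return [0.0]
    x0 = math.sqrt(-c / 3)
    return [-x0, x0]

# As c -> 0 from below, the two critical points close in on x = 0.
for c in (1.0, 0.0, -0.03, -0.3, -3.0):
    print(f"c = {c:6.2f}   critical points: {critical_points(c)}")
```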

Let us return to Exercise 2.2.1 devoted to the specific heat capacity
of diamond. There we saw that there exists an empirical relationship
between the temperature T and the quantity of heat Q = Q(T) (in
joules) required to heat 1 kg of diamond from 0 to T °C:

Q(T) = 0.3965 T + 2.081 × 10^-3 T^2 - 5.024 × 10^-7 T^3,

which implies that the specific heat capacity of diamond, c = c(T)
(in J/kg·°C), is expressed by the formula

c(T) = 0.3965 + 4.162 × 10^-3 T - 15.072 × 10^-7 T^2

(see the solution to Exercise 2.2.1). To analyze the behavior of the
specific heat capacity of diamond in the temperature interval in which
the empirical formula is valid, we need the expression for the
derivative of c:

c'(T) = 4.162 × 10^-3 - 30.144 × 10^-7 T.

It is clear that c'(T) is positive for
T < (4.162/30.144) × 10^4 = T_0 ≈ 1380 °C, is zero at T = T_0, and is
negative for T > T_0.

But, as noted earlier, the above empirical formula is valid only up to
800 °C. All the above formulas imply that c'(T) vanishes at
T ≈ 1380 °C, that c(T) vanishes at T ≈ 2850 °C, and that the quantity
of heat Q(T) vanishes at T ≈ 4320 °C. Of course, all these results are
meaningless and prove that empirical formulas cannot be employed
outside their range of application.
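The three temperatures quoted for the diamond formulas can be checked with a few lines of arithmetic. This is a rough Python sketch under the coefficients given above; the helper larger_root and the variable names T0, Tc, and TQ are introduced here purely for illustration.

```python
import math

def larger_root(a, b, c):
    """Larger real root of a*T^2 + b*T + c = 0 (assumes real roots exist)."""
    d = math.sqrt(b * b - 4 * a * c)
    return max((-b + d) / (2 * a), (-b - d) / (2 * a))

# c'(T) = 4.162e-3 - 30.144e-7 * T vanishes at T0.
T0 = 4.162e-3 / 30.144e-7

# c(T) = 0.3965 + 4.162e-3 * T - 15.072e-7 * T^2 vanishes at Tc.
Tc = larger_root(-15.072e-7, 4.162e-3, 0.3965)

# Q(T) = T * (0.3965 + 2.081e-3 * T - 5.024e-7 * T^2) vanishes at TQ (besides T = 0).
TQ = larger_root(-5.024e-7, 2.081e-3, 0.3965)

print(f"c'(T) = 0 near T = {T0:.0f} C")
print(f"c(T)  = 0 near T = {Tc:.0f} C")
print(f"Q(T)  = 0 near T = {TQ:.0f} C")
```

All three values lie far above 800 °C, the stated limit of validity of the empirical formula.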

Determining maxima and minima 
arithmetically (by computing and com- 
paring the values of the function for 
different values of the independent var- 
iable) is many times more arduous and 
less exact. However, if you employ a 
pocket calculator, then an “arithmetic 
experiment” enables finding a maximum 
or a minimum by meaningfully compar- 
ing the values of the function at differ- 
ent points (a calculator must be used, 
of course, in a proper manner), and our 
problem is considerably simplified. But 
even in this case the use of derivatives 
makes the calculations more transparent 
and simple. Higher mathematics is not 
only a remarkable achievement of the 
mind. Practical computational prob- 
lems are resolved much more easily by 
the methods of higher mathematics. 

Exercises 

2.6.1. Find the values of x for which the following functions have a
maximum or a minimum. In each case determine whether a minimum or a
maximum is involved. For functions involving constants give the answer
for various values of these constants (in particular, with different
signs): (a) y = ax^2, (b) y = x + 1/x, (c) y = ax + b/x,
(d) y = x^3 - 3x + 100, (e) y = x^3 + px^2 + qx + r,
(f) y = x^4 + ax^2 + b, and (g) y = ax^2 + b/x^2.

2.6.2. The temperature dependence of the volume of 1 g (or cm^3) of
water is given by the following empirical formula:
v(t) = 1 + 8.38 × 10^-6 (t - 4)^2. At what temperature will the volume
be minimal?

2.6.3. Solve Exercise 2.6.2 with the following refinements (suggested
by various scientists) of the above empirical formula:
(a) v(t) = 1 - 61.045 × 10^-6 t + 77.183 × 10^-7 t^2 - 37.34 × 10^-9 t^3,
and (b) v(t) = 1 - 57.577 × 10^-6 t + 75.601 × 10^-7 t^2 - 35.07 × 10^-9 t^3.

2.7 The Second Derivative
of a Function. Convexity and
Concavity of a Curve.
Points of Inflection

Let us once more study the behavior of a function near its “critical”
points (i.e. points at which it might have a maximum or a minimum)
x = x_0 of the curve y = y(x) representing the function, or points at
which

y'(x_0) = 0.   (2.7.1)

How do we distinguish a maximum from a minimum if condition (2.7.1) is
satisfied? We know that condition (2.7.1) holds true both for (local)
maxima and (local) minima, the difference being in the sign of y'(x)
for x < x_0 and for x > x_0.

But how is it possible to determine the sign of y'(x) for x close to
x_0 without computing y' directly for other values of x? Let us start
with the case of a maximum of the function y(x), when y'(x) > 0 for
x < x_0 and y'(x) < 0 for x > x_0. We see that here the derivative
y'(x) is itself a decreasing function: as x increases, the derivative,
which was at first positive (for x < x_0), vanishes (when x = x_0)
and, continuing to fall, becomes negative when x > x_0. But we already
know how to distinguish a decreasing function from an increasing
function: its derivative is negative. Hence, at a value x = x_0 at
which y has a maximum the derivative y'(x_0) vanishes and the
derivative of the derivative is negative. This quantity, the
derivative of a derivative, which by the ordinary rules can be written
as a “double-decker” fraction,

dy'/dx = d(dy/dx)/dx,

is called the second derivative. It has the notation y''(x) or
d^2y/dx^2 and is read “d two y over d x squared.”

It is clear that if z = z(t) is a function that specifies the
dependence of distance on time (see Section 2.1), then z'(t) = v is
the velocity of the motion and z''(t) = dv/dt is the rate of change of
velocity, or the acceleration (see also Chapter 9). If y has the
dimensions of distance (cm) and x is time (s), d^2y/dx^2 has the
dimensions of acceleration (cm/s^2); if y is measured in units of e_1
and x in units of e_2, the derivative d^2y/dx^2 has the dimensions of
e_1/e_2^2, as indicated by the notation d^2y/(dx)^2. It is advisable
to bear in mind the dimensions of the second derivative when dealing
with physical problems that involve y''(x).

It is the meaning of the second derivative, acceleration, in our
distance-time problem that makes the idea of a second derivative so
important to physics, since by Newton’s second law the acceleration is
the basic characteristic of motion. In Chapter 9 we discuss all these
matters in greater detail.

Let us revert to the examples given above. If y(x) = 3x^3 - x^2 - x,
then y'(x) = 9x^2 - 2x - 1 and, hence,

y''(x) = (y'(x))' = 18x - 2.

At x ≈ -0.24 we have y' = 0 and y'' ≈ -6.3 < 0 and, indeed, point
x ≈ -0.24, y ≈ 0.14 is a maximum of y(x). At x ≈ 0.46 we have y' = 0
and y'' ≈ 6.3 > 0, that is, point x ≈ 0.46, y ≈ -0.38 is a minimum of
the function.

Similarly, if y = x^3 - x, then y' = 3x^2 - 1 and y'' = 6x. Therefore,
at x = -1/√3 ≈ -0.577 the second derivative y'' is negative, and at
x = +1/√3 it is positive: at the first point the function has a
maximum and at the second a minimum. (This result could also be
predicted geometrically, since at x = -1 and at x = 0 the function
y = x^3 - x = -(x - x^3) vanishes and at the intermediate points it is
positive, so that at x ≈ -0.577 there can be only a maximum and not a
minimum.) In exactly the same way, in the example with the specific
heat capacity of diamond (see Section 2.6) the second derivative
c''(T) = -3.0144 × 10^-6 is always negative, which would mean that the
curve representing c(T) has a maximum if our formulas could be applied
at T_0 where c'(T_0) = 0 (actually the specific heat capacity of
diamond, just as that of other solids, increases with T and tends to a
constant value at high temperatures).


To summarize, then, if at a certain value of x the function y(x) is
such that

y'(x) = 0, y''(x) < 0,   (2.7.2a)

then at this point the function has a maximum, and if at a certain
value of x the function y(x) is such that

y'(x) = 0, y''(x) > 0,   (2.7.2b)

then at this point the function has a minimum. (The reader is advised
to return to the exercises accompanying Section 2.6 and apply the
second-derivative criterion, which enables distinguishing between a
maximum and a minimum.)
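Criteria (2.7.2a) and (2.7.2b) translate directly into a short routine. The following Python sketch is one possible rendering, not a definitive implementation; the function second_derivative_test, its tolerance parameter, and the string labels it returns are assumptions made for the illustration. The derivatives are supplied by hand as ordinary functions.

```python
import math

def second_derivative_test(d1, d2, x, tol=1e-9):
    """Classify the point x using conditions (2.7.2a) and (2.7.2b).

    d1 and d2 are functions returning y'(x) and y''(x).
    """
    if abs(d1(x)) > tol:
        return "not critical"      # y'(x) is not zero
    s = d2(x)
    if s < 0:
        return "maximum"           # y' = 0, y'' < 0
    if s > 0:
        return "minimum"           # y' = 0, y'' > 0
    return "inconclusive"          # y' = 0 and y'' = 0: the test says nothing

# Example from the text: y = x^3 - x, y' = 3x^2 - 1, y'' = 6x.
d1 = lambda x: 3 * x**2 - 1
d2 = lambda x: 6 * x
x0 = 1 / math.sqrt(3)

print(second_derivative_test(d1, d2, -x0))
print(second_derivative_test(d1, d2, +x0))
```

The "inconclusive" branch covers points where the second derivative also vanishes, for which the sign of y' on either side must be examined instead.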

We see that the sign of the derivative of a function y(x) has a simple
geometrical meaning: the fact that the derivative is positive means
that the function is increasing, that is, its graph is directed upward
if we move along the x axis from left to right, while if y'(x) < 0,
the function is decreasing, that is, its graph is directed downward.
The sign of the second derivative y''(x) also has a simple geometrical
meaning. If the second derivative of a function is positive, or
y''(x) > 0, this means that the first derivative y'(x) is increasing
or, in other words, the angle α (where tan α = k = y'(x)) formed by
the tangent line to the curve and the x axis increases. But in this
case, as we move from left to right along the x axis, the tangent line
rotates counterclockwise (point A in Figure 2.7.1). And this,
obviously, means that our curve is convex downward (cf. Section 1.4);
sometimes it is said that the curve is concave upward.

Figure 2.7.2

Similarly, the fact that y''(x) < 0 means that the first derivative
y'(x) of the function y = f(x) decreases, and so does the angle α.
Thus, at the points where y''(x) < 0, the tangent line rotates
clockwise as we move from left to right along the x axis, that is, the
curve is convex upward, or concave downward (point C in Figure 2.7.1).
The points at which y''(x) = 0 can be generally characterized as
points at which the second derivative changes its sign, that is,
points at which concavity is replaced with convexity or the other way
round.[2.8] At such points the tangent to the graph of the function
goes from one side of the curve to the other, that is, the tangent
intersects the curve. Points at which the tangent intersects the curve
(point B in Figure 2.7.1) are known as points of inflection.[2.9]

[2.8] The rare instances of the type of the origin for the curve
y = x^4 (see Figure 1.5.1b; here y' = 4x^3, y'' = 4 × 3x^2 = 12x^2, so
that y'' = 0 at x = 0, while the curve does not change the direction
of convexity at this point) refer exclusively to the points at which
the first, second, and third derivatives vanish simultaneously (or the
second derivative has a salient point or a discontinuity). (The third
derivative is the derivative of the second derivative and is written
as y''' or d^3y/dx^3 and has the dimensions of e_1/e_2^3, where e_1
and e_2 are the units of measurement of y and x. Below we will
encounter third and higher-order derivatives.)

For instance, when the graph of a function y = f(x) is intersected by
a horizontal tangent at point x = x_0 (Figure 2.7.2), this point is
neither a maximum nor a minimum of the function, while the derivative
at the point is zero (the tangent line is horizontal). Thus, we see
that the cases where condition (2.7.1) does not work, that is, the
point where the derivative vanishes is neither a maximum nor a
minimum, are due to the fact that in addition to (2.7.1) the following
condition holds true at this point:

y''(x_0) = 0,   (2.7.3)

which characterizes a point of inflection. (On the other hand, compare
what we have just said with footnote 2.8, where the function y = x^4
satisfies both conditions, (2.7.1) and (2.7.3), at point x = 0 and
nevertheless has a minimum at this point.)
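When both y' and y'' vanish, one can fall back on the sign of y' on either side of the point, as in Section 2.6. The Python sketch below illustrates this for y = x^3 (a horizontal point of inflection at x = 0) and y = x^4 (a minimum at x = 0). The function name classify_by_sign_change and the step h are hypothetical choices made for this illustration, and the check is only as reliable as the choice of h.

```python
def classify_by_sign_change(d1, x0, h=1e-3):
    """Classify a critical point x0 from the sign of y' just left and right of it.

    d1 is a function returning y'(x); h is a small illustrative step.
    """
    left, right = d1(x0 - h), d1(x0 + h)
    if left > 0 > right:
        return "maximum"     # y' changes from plus to minus
    if left < 0 < right:
        return "minimum"     # y' changes from minus to plus
    return "neither"         # y' keeps its sign: no extremum here

# y = x^3: y' = 3x^2 keeps its sign near 0 (point of inflection).
print(classify_by_sign_change(lambda x: 3 * x**2, 0.0))

# y = x^4: y' = 4x^3 changes from minus to plus at 0.
print(classify_by_sign_change(lambda x: 4 * x**3, 0.0))
```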

In Section 7.4 we will once more en- 
counter the concept of convexity of a 
function. 

Exercises 

2.7.1. Find the second derivatives of the following functions:
y = x^2, y = x^3, y = x^4, and y = ax^2 + bx + c.

2.7.2. Find the acceleration of a point moving along a straight line
according to the following law: z = at^2 + bt + c. What can be said
about the acceleration?

2.7.3. Specify the ranges of convexity and concavity for the following
functions: (a) y = x^3, (b) y = x^3 + px^2 + qx + c, and
(c) y = x + 1/x.

[2.9] Note that at a point of inflection the tangent intersects the
curve smoothly, touching it without a break, contrary to the case
where the y axis intersects the parabola y = x^2.



Chapter 3 What is an Integral? 


3.1 Determining Distance
from the Rate of Motion.
The Area Bounded by a Curve

The problem of determining the instantaneous rate of motion, or
velocity, v(t) from a given dependence of the position of a body upon
the time, z = z(t), led us to the concept of the derivative:

v(t) = dz/dt.

The inverse problem consists in determining the position of a body,
z = z(t), that is, the distance covered by a body in a given interval
of time t, when we know the instantaneous velocity as a function of
the time, v = v(t). This problem brings us to the second most
important concept of higher mathematics, that of the integral.

Let us agree on some convenient notation. We consider the distance
traveled during time from t_1 to t_2. So as to avoid subscripts, let
us denote the beginning of the time interval by a and the end of the
interval by b, so that t_1 = a and t_2 = b. We denote the distance
covered from time a to time b by z(a, b). Remember that when the two
quantities, a and b, stand under the function symbol z in parentheses,
then z(a, b) is the distance covered during the interval of time from
t = a to t = b, whereas z(t) with the single quantity t in parentheses
is the position (coordinate) of the body at a specified time t. These
quantities are related in a simple manner:

z(a, b) = z(b) - z(a), or z(b) = z(a) + z(a, b)   (3.1.1)

(here we assume that b > a). We see that the distance covered during
time from a to b is equal to the difference between the coordinate at
the end of the time interval under consideration, z(b), and that at
the beginning of the interval, z(a).[3.1]

Now let us compute z(a, b). In the most elementary case, when the
velocity is constant, or

v(t) = constant = v_0,   (3.1.2)

the distance covered will obviously be equal simply to the product of
the time of motion by the velocity:

z(a, b) = (b - a) v_0.   (3.1.3)

Taking advantage of the graph of velocity versus time, we find that a
constant velocity is associated with a horizontal straight line
(Figure 3.1.1). The distance covered is clearly equal to the hatched
area because the area of a rectangle is equal to the product of the
base (b - a) by the altitude (v_0). What do we do in the general case
where the instantaneous velocity is not a constant quantity?

Let us make a detailed study of a numerical example. Suppose the
velocity is given by the formula[3.2] v = t^2. We seek the distance
covered during the time interval from t = a = 1 to t = b = 2.

We partition the entire interval from a to b into ten subintervals and
set up the table of velocity:

t    1.0    1.1    1.2    1.3    1.4    1.5
v    1.0    1.21   1.44   1.69   1.96   2.25

t    1.6    1.7    1.8    1.9    2.0
v    2.56   2.89   3.24   3.61   4.0

Why is it difficult to compute the distance at a velocity v(t) given
by a formula? Clearly because the velocity is variable (for a constant
velocity, the answer is trivial). In the case at hand, the velocity
increases fourfold over the time interval from t = 1 to t = 2.
However, after this interval is partitioned into ten parts
(subintervals), the velocity varies less over each subinterval of
duration 0.1 second (only 10 to 20 percent). Therefore, in the
subintervals the velocity can roughly be taken to be constant, and we
can compute the distance covered during such a subinterval of time as
the product of that subinterval by the velocity. In what follows, we
denote the subintervals into which the interval from t = a to t = b is
divided by Δt (the time increment); for each such subinterval Δt we
will assume the distance to be simply proportional to Δt, with the
velocity assumed constant over Δt.

Figure 3.1.1

[3.1] For the sake of simplicity we consider only motion in one
direction in such pictorial considerations, although possibly, the
velocity varies in time.

[3.2] If the velocity v is expressed in units of cm/s and time t in
units of s, then, in order to maintain the requirements of
dimensionality, we write v = kt^2, where the factor k has the
dimensions of cm/s^3. Here we assume that k = 1 cm/s^3.

To compute the distance covered during each subinterval Δt, equal to
0.1 s, we utilize the initial velocity in the given subinterval:
1 cm/s in Δt from 1 to 1.1 s, 1.21 cm/s in Δt from 1.1 to 1.2 s, and
so on; finally, 3.61 cm/s in the last subinterval Δt extending from
1.9 to 2.0 s. The total distance covered during the time interval from
t = 1 to t = 2 is then computed to be

z(1, 2) ≈ 0.1 + 0.121 + 0.144 + . . . + 0.361 = 2.185 cm.   (3.1.4)

It is quite clear that we have underestimated the actual distance
covered because the velocity here increases with time and so the
velocity at the beginning of each subinterval is less than the average
velocity. Each of the ten terms into which the entire distance was
partitioned is slightly less than the actual value, and so the result
has a deficit, too.

Let us now compute the distance somewhat differently, namely, in each
subinterval Δt we will take the value of velocity at the end of the
subinterval. For the first subinterval Δt from 1 to 1.1 s this
velocity is equal to 1.21 cm/s, for the last one from 1.9 to 2.0 s it
is 4 cm/s. We then get the following estimate for the entire distance
traveled:

z(1, 2) ≈ 0.121 + 0.144 + . . . + 0.400 = 2.485 cm.   (3.1.5)

This calculation clearly yields the distance z(1, 2) with an excess.

Hence, the true value lies between 2.185 and 2.485 cm. The difference
between these numbers amounts to about 15%. Rounding off the boundary
values for z yields 2.18 < z(1, 2) < 2.49.

These computations can be illustrated by means of a graph. We
construct the graph (Figure 3.1.2) with time laid off on the axis of
abscissas and velocity on the axis of ordinates. In the figure we
divide the time interval into five intervals instead of ten (as we did
in the table), which makes each part (step) stand out more clearly.
Each term in the first sum (3.1.4) is the area of a narrow rectangle
with the corresponding subinterval Δt as the base and the velocity at
the beginning of the subinterval as the altitude. Thus, the sum is the
area under the polygonal (step-like) line (the hatched area in
Figure 3.1.2a).

Figure 3.1.3

The second sum (3.1.5), in which the velocity in each subinterval is
taken at the end of the subinterval, corresponds to the hatched area
in Figure 3.1.2b.

How can we make a more accurate computation of the distance covered
during a given time (from t = a = 1 s to t = b = 2 s in our example)?
Clearly, the distance between the lower and upper estimates, that is,
the difference between 2.18 and 2.49 in our case, depends on the
variation of velocity within the limits of each subinterval Δt. For
this difference to become smaller, we must simply see to it that the
velocity varies less rapidly within each subinterval; but since the
velocity is a quantity over which we have no power, we have to
partition the time interval into smaller subintervals. This approach
suits us perfectly since it brings us closer to our goal: if all the
subintervals are made smaller (while the total number of the
subintervals increases proportionately), the two estimates for the
distance z(a, b) obtained through the method discussed above move
closer to each other.

For instance, if we split up the 1-to-2-s interval into 20
subintervals (instead of 10) of 0.05 s duration each, using the method
discussed above and the initial velocities in each Δt, we compute the
distance to be

z(1, 2) ≈ 0.05 + 0.05 × 1.1025 + . . . + 0.05 × 3.8025 = 2.25875.   (3.1.4a)

Using the terminal velocities in each subinterval, we get the distance

z(1, 2) ≈ 0.05 × 1.1025 + 0.05 × 1.21 + . . . + 0.05 × 4 = 2.40875.   (3.1.5a)

The difference between 2.25875 and 2.40875 now amounts to about 7%.
The range within which z(1, 2) lies has narrowed down. Rounding off
these figures, we get 2.26 < z(1, 2) < 2.41.
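The sums (3.1.4), (3.1.5), (3.1.4a), and (3.1.5a) are easy to reproduce by machine. The following is a minimal Python sketch of the step (rectangular) method for v = t^2 on the interval from 1 to 2; the function name left_right_sums and its parameters are illustrative assumptions, not notation from the text.

```python
def left_right_sums(v, a, b, n):
    """Lower and upper step estimates of the distance z(a, b).

    The interval [a, b] is split into n equal subintervals of length dt;
    'left' uses the velocity at the start of each subinterval,
    'right' uses the velocity at its end.
    """
    dt = (b - a) / n
    left = sum(v(a + k * dt) for k in range(n)) * dt
    right = sum(v(a + (k + 1) * dt) for k in range(n)) * dt
    return left, right

v = lambda t: t**2

left10, right10 = left_right_sums(v, 1.0, 2.0, 10)
left20, right20 = left_right_sums(v, 1.0, 2.0, 20)

print(f"n = 10: {left10:.5f} < z(1, 2) < {right10:.5f}")
print(f"n = 20: {left20:.5f} < z(1, 2) < {right20:.5f}")
```

Because v = t^2 increases on the interval, the left sum is always the lower estimate and the right sum the upper one; for a decreasing velocity the roles would be reversed.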

As we reduce the subintervals Δt, the result approaches the true value
of the distance covered. It will be computed later on and proves to be
equal to z(1, 2) = 2 1/3 ≈ 2.333. As we reduce Δt, the difference
between the initial and terminal velocities in each subinterval Δt
decreases, and so also does the relative error in each summand. It is
because we are dealing here with the relative error that we can be
sure that although the number of terms (that is, the number of errors)
increases, nevertheless the precision with which we estimate the
entire sum of the distances, that is, the precision with which we
estimate z(1, 2), is sure to increase.

To illustrate the above reasoning, in Figure 3.1.3 we depict the
constructions connected with the estimate of z(1, 2) from the given
velocity v = t^2 at Δt = 0.5, 0.2, and 0.1 s. Geometrically, it is
obvious that as we increase the number of subintervals Δt and reduce
the duration of each one, the dimensions of each step in Figure 3.1.3
become smaller, whereby the step-line comes closer and closer to the
curve v versus t.

We thus conclude that the distance covered during time from t = a to
t = b, given an arbitrary dependence of the instantaneous velocity on
time, v = v(t), is equal to the area bounded by the curve v = v(t),
the vertical lines t = a and t = b, and the t axis (Figure 3.1.4).

This conclusion yields a method for practical computation of the
distance: we can construct the graph on graph paper and determine the
hatched area either by counting the squares, or, for instance, by
cutting out the area of paper, weighing the sheet, and comparing its
weight with the weight of a rectangular or square piece of the same
paper of known area.[3.3]

[3.3] A pocket calculator, however, makes the method of constructing
sums of the type (3.1.4) and (3.1.5) (or (3.1.4a) and (3.1.5a)) much
easier to carry out than in the case of primitive weighing of paper or
cardboard figures.


Such methods are convenient and quite justified when the velocity is
not known exactly, that is, is given in the form of a table or graph
obtained empirically (in an experiment) rather than by a formula. We
will not dwell on these approximate methods and will attempt to
express the distance by a formula when the velocity is given by a
formula. Note also that in some cases when, say, the velocity is
expressed by the formula v(t) = (1 - t)^-1, with t varying from 0 to
1 s, and increases without limit as t → 1 s, the concept of distance
becomes meaningless (the distance tends to infinity). For the time
being we will not consider such cases (but see Section 5.2).

We can also make more precise the numerical method used above to
determine the distance from the velocity. We will assume that in each
subinterval Δt the velocity is constant and equal to the arithmetic
mean (half-sum) of the initial and terminal velocities in the given
subinterval. In this approach, with a partition into ten subintervals,
the velocity in the first subinterval from 1 to 1.1 s is taken equal
to (1 + 1.21)/2 = 1.105 cm/s (see the table at the beginning of this
section) and the distance covered during this subinterval of time is
0.1105 cm, the distance covered during the second subinterval is
0.1 (1.21 + 1.44)/2 = 0.1325 cm, and so on. Adding all the distances
obtained in this manner, we get the distance covered during the time
interval from a = 1 s to b = 2 s:
z(1, 2) = 0.1105 + 0.1325 + . . . = 2.335 cm. If we divide the
interval into 20 subintervals, we get (using the same half-sum of
velocities) z(1, 2) = 2.33375 cm. These values are much closer to the
true value 2.33333 cm than those computed on the basis of the initial
and terminal values of velocity for the same number of subintervals:
for ten subintervals, the error is equal to 0.07% instead of 15% in
the earlier method, and for 20 subintervals the error is only 0.02%
instead of 7%.
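The half-sum computation can be sketched in the same way as the step sums. The Python fragment below is an illustration only: the function name trapezoid and the variables z10, z20, err10, and err20 are assumptions introduced here, and the true value 7/3 is taken from the text (2 1/3 cm).

```python
def trapezoid(v, a, b, n):
    """Estimate z(a, b) using the half-sum of the initial and terminal
    velocities in each of n equal subintervals."""
    dt = (b - a) / n
    return sum((v(a + k * dt) + v(a + (k + 1) * dt)) / 2 for k in range(n)) * dt

v = lambda t: t**2
true_value = 7 / 3                       # 2 1/3 cm, as stated in the text

z10 = trapezoid(v, 1.0, 2.0, 10)
z20 = trapezoid(v, 1.0, 2.0, 20)

# Relative errors in percent.
err10 = 100 * abs(z10 - true_value) / true_value
err20 = 100 * abs(z20 - true_value) / true_value

print(f"n = 10: z = {z10:.5f} cm, relative error {err10:.3f}%")
print(f"n = 20: z = {z20:.5f} cm, relative error {err20:.3f}%")
```

Doubling the number of subintervals cuts the error roughly fourfold here, whereas for the step sums it is only halved.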

The new method for finding the distance can also be displayed vividly
on a graph. The product of the half-sum of the velocities at the
beginning and end of an interval by the magnitude of the interval of
time is equal to the area of the trapezoid ABCD (Figure 3.1.5). With
bases AB and DC and altitude AD, the area is

((AB + DC)/2) · AD.

For this reason, determining the distance on the basis of the half-sum
of the velocities is known as the trapezoid method. For the shape of
the curve shown in Figure 3.1.5, the area of the trapezoid is somewhat
greater than the area bounded by the straight lines BA, AD, DC and the
portion BC of the curve v = v(t). The difference between the area of
the trapezoid and the area bounded by the arc of the curve is equal to
the area of the crescent-like figure formed by the chord BC and the
portion BC of the curve (shown hatched in Figure 3.1.5). The sum of
all such areas yields the error, that is, the difference between the
true value of the distance and the value computed by the trapezoid
method. A comparison with Figures 3.1.2 and 3.1.3 shows vividly that
the error in the trapezoid method should be less than that when
formulas (3.1.4) and (3.1.5) are employed (the step method or, as
mathematicians call it, the rectangular method).

Figure 3.1.5

Of course, when one compares the 
distance with the area under the graph 
representing the v versus t dependence, 
it is important to take into account the 
scale used (compare to what was said 
in this connection in Section 1.7). 
Suppose that 1 cm along the axis of ab- 
scissas on the graph corresponds to a 
time interval of T seconds and 1 cm on 
the axis of ordinates corresponds to a 
velocity of V cm/s. Then, if the motion
is at a constant velocity $v_0$ during a
time from t = a to t = b, the distance
covered is equal to $v_0 (b - a)$, and
the area of the (hatched) rectangle on
the graph (Figure 3.1.1) is equal to

$$S = \frac{v_0}{V} \cdot \frac{b - a}{T}\ \text{cm}^2.$$

Thus, here we have $z(a, b) = SVT$.

This relationship between the distance 
covered and the area on the graph of 
velocity bounded by the curve v (t), 
the axis of abscissas, and two vertical 
lines is preserved in the case of a 
variable velocity v, that is, in the case 
of an arbitrary function v = v (t). 

3.2 The Definite Integral 

In the preceding section, two problems, finding the distance traversed
by a body and the equivalent problem of finding the area under a curve,
led to a consideration of sums of a special type with a large number of
small terms (summands). A rigorous approach to such problems leads to
the concept of the integral.

Figure 3.2.1

The distance z = z (a, b) found from 
a given velocity v = v (t) is called the 
definite integral of the function v (t) 
(velocity) with respect to the variable 
t (time) taken from a to b (the same 
quantity is sometimes called the integral 
of function v ( t )). 

We now give a mathematical defini- 
tion of the integral that corresponds to 
the ideas illustrated by the numerical 
example of Section 3.1. The definition 
will remain valid when we consider 
physical or mathematical quantities of 
a nature different from velocity and 
distance. 

Suppose that we have a function v = v(t). To find its integral from a
to b we partition the interval into a large number of subintervals, n.
We denote the values of the independent variable t at the endpoints of
the subintervals by $t_0, t_1, t_2, \ldots, t_{n-1}, t_n$, where,
obviously, $t_0 = a$ and $t_n = b$ (Figure 3.2.1). The lengths
$\Delta t$ of the small subintervals of time are equal to the
difference between adjacent values of t. [3.4] Thus, for an arbitrary
$l$ (with $l = 1, 2, \ldots, n$)

$$\Delta t_l = t_l - t_{l-1}.$$

The subscripts on the quantities $t$ and $\Delta t$ are merely number
labels, or indices, as they are sometimes called (see footnote 1.1).

[3.4] If the interval from a to b is specially partitioned into n equal
parts, then each subinterval $\Delta t$ equals $(b - a)/n$. It is not
obligatory, however, that the subintervals be equal; the only thing we
require is that each subinterval be small. The reader will see the
truth of this if he or she thinks through the distance-velocity example
of Section 3.1; see also Exercise 3.2.3.

The approximate value of the integral z(a, b) is given by the formula

$$z(a, b) \approx \sum_{l=1}^{l=n} v(t_{l-1})\,\Delta t_l. \tag{3.2.1}$$

The symbol $\sum_{l=1}^{l=n}$ signifies that the expression standing to
the right of the summation symbol [3.5] is to be taken for all values
of $l$ from 1 to $n$ and then all these expressions are to be added
together. For example, if n = 10, then

$$\sum_{l=1}^{l=10} v(t_{l-1})\,\Delta t_l = v(t_0)\,\Delta t_1 + v(t_1)\,\Delta t_2 + \ldots + v(t_9)\,\Delta t_{10}.$$

In the example of Section 3.1 (see the table in that section),
$t_0 = 1$, $t_1 = 1.1$, $t_2 = 1.2$, ..., and

$$z(1, 2) \approx \sum_{l=1}^{l=10} v(t_{l-1})\,\Delta t_l = 2.185.$$

In the approximate expression (3.2.1), the value of the function v(t)
in each subinterval was taken at the beginning of the subinterval, at
the point $t_{l-1}$. A different approximate expression is obtained if
we take the value of the function at the endpoint of each subinterval:

$$z(a, b) \approx \sum_{l=1}^{l=n} v(t_l)\,\Delta t_l. \tag{3.2.2}$$

In the example of Section 3.1, this sum 
for n = 10 was equal to 2.485. 
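The two sums (3.2.1) and (3.2.2) translate directly into code. This is a sketch under the assumption that the velocity of Section 3.1 is $v = t^2$; the helper names `left_sum` and `right_sum` are hypothetical. The partition is passed as an explicit list of points, so the subintervals need not be equal.

```python
def left_sum(v, points):
    """Sum (3.2.1): v is taken at the beginning t_{l-1} of each subinterval."""
    return sum(v(points[l - 1]) * (points[l] - points[l - 1])
               for l in range(1, len(points)))


def right_sum(v, points):
    """Sum (3.2.2): v is taken at the endpoint t_l of each subinterval."""
    return sum(v(points[l]) * (points[l] - points[l - 1])
               for l in range(1, len(points)))


def v(t):
    # Assumption: velocity law inferred from the table of Section 3.1.
    return t ** 2


# t_0 = 1, t_1 = 1.1, ..., t_10 = 2
points = [1 + l / 10 for l in range(11)]
z_left = left_sum(v, points)
z_right = right_sum(v, points)
print(z_left, z_right)
```

With these ten subintervals the two values agree with the 2.185 and 2.485 quoted in the text, up to rounding in floating-point arithmetic.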

The definite integral of a function v(t) taken from a to b is the limit
approached by the sums (3.2.1) and (3.2.2) as all subintervals
$\Delta t_l$ tend to zero. The integral is written

$$z(a, b) = \int_a^b v(t)\,dt \tag{3.2.3}$$

(read: z(a, b) equals the definite integral of v(t) from a to b, dt).
The integral sign $\int$ is merely an elongated S (the first letter of
the word summa, which is the Latin for sum).

[3.5] The symbol $\Sigma$ is the capital Greek letter sigma. It
corresponds to S in the Latin alphabet, the first letter of the term
sum.


Unlike $\Delta t$, the symbol $dt$ signifies that in order to obtain
the exact value of the integral it is necessary to pass to the limit as
all subintervals $\Delta t$ tend to zero (just as the derivative
$dz/dt$ is obtained from the ratio $\Delta z/\Delta t$ if we send
$\Delta z$ and $\Delta t$ to zero and pass to the limit). The formulas
(3.2.1) and (3.2.2), in which the $\Delta t$ are small but finite, only
yield approximate values of the integral, just as the ratio
$\Delta z/\Delta t$ only yields an approximate value for $dz/dt$ for
finite $\Delta z$ and $\Delta t$.

When the subintervals $\Delta t$ become smaller and smaller, it is
immaterial whether we take the value of the function v(t) at the
beginning, at the end, or in the middle of the subintervals, which is
to say that it is immaterial whether we proceed from (3.2.1) or from
(3.2.2) or from some other expression for the “integral sum” similar to
the one in the right-hand sides of (3.2.1) and (3.2.2) (say, we could
have added the terms $v(t_m)\,\Delta t$, where $t_m$ is the middle of
the corresponding subinterval $\Delta t$). For this reason formula
(3.2.3) simply has v(t), which is a value of the function in the
subinterval $\Delta t$, without any indication of whether the value of
t is taken inside the subinterval or at its beginning or end.
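A short numerical experiment illustrates this. The sketch below again assumes $v = t^2$ on the interval from 1 to 2; the function `integral_sum` and its `where` argument are names chosen here for illustration only.

```python
def integral_sum(v, a, b, n, where):
    """Integral sum over n equal subintervals; `where` selects the point of
    each subinterval at which v is evaluated."""
    dt = (b - a) / n
    offset = {"begin": 0.0, "middle": 0.5, "end": 1.0}[where]
    return sum(v(a + (l + offset) * dt) * dt for l in range(n))


def v(t):
    # Assumption: the velocity law of the running example.
    return t ** 2


results = {}
for n in (10, 100, 1000):
    results[n] = [integral_sum(v, 1.0, 2.0, n, w)
                  for w in ("begin", "middle", "end")]
    print(n, results[n])
```

As n grows, the three columns draw together, which is what the passage to the limit in (3.2.3) expresses.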

Another way in which the integral (3.2.3) differs from the sums (3.2.1)
and (3.2.2), which yield approximate values of the integral, lies in
the fact that as the $\Delta t$ become smaller and smaller and the
number of subintervals increases, we no longer label them. For this
reason, we indicate on the integral only the limits of variation of t
(range of t) from a to b. The quantity a is placed at the bottom of the
integral sign and is termed the lower limit of integration, while b is
placed at the top and is called the upper limit of integration. [3.6]
The range of t from a to b is called the domain of integration. The
function v(t) in the expression of the integral is called the
integrand, t being the variable of integration.

Thus, the integral is defined as the limit approached by the sum of
products of the values of the function multiplied by the difference of
the values of the independent variable when all differences of the
independent variable tend to zero: [3.7]

$$\lim_{\Delta t_l \to 0} \sum_{l=1}^{l=n} v(t_l)\,\Delta t_l = \lim_{\Delta t_l \to 0} \sum_{l=1}^{l=n} v(t_{l-1})\,\Delta t_l = \int_a^b v(t)\,dt. \tag{3.2.4}$$

Although the first and second sums in (3.2.4) are different for a
finite number n of small subintervals $\Delta t_l$, their limits
coincide when all subintervals $\Delta t$ decrease without limit (tend
to zero). These limits yield the value of the integral.

As $\Delta t$ tends to zero, each separate summand tends to zero, but
on the other hand the number of terms in the sum increases and
approaches infinity. The sum itself tends to a very definite limit,
which is the solution of the problem and is termed the integral. If the
function is the instantaneous velocity, this limit, the integral of the
function, is equal to the distance covered. If the integrand represents
the ordinates of the points of the graph y = f(x), the integral
$\int_a^b f(x)\,dx$ is equal to the area bounded by our graph, the x
axis, and the vertical lines x = a and x = b.

[3.6] In this section the word “limit” is used in two meanings. First,
the integral is the limit of a sum in the same sense that the
derivative is the limit of a ratio. Here, limit corresponds to the sign
lim. Second, we speak of the limits of variation of t from a to b, the
limits of integration a and b. The meaning here is different. The
attentive reader will readily see which of the two meanings is used in
any given case.

[3.7] To be more precise, we should have written

$$\lim_{\Delta t_1, \Delta t_2, \ldots, \Delta t_n \to 0} \sum_{l=1}^{l=n} v(t_l)\,\Delta t_l$$

(and similarly for the second sum), but such notation is too unwieldy.


Naturally, not just any sum of a large number n of small terms tends to
a definite limit as $n \to \infty$. Say, the sum of n terms each of
which is $1/\sqrt{n}$ is equal to $\sqrt{n}$ and tends to infinity as
$n \to \infty$, which means that this sum has no finite value. We shall
try to explain why in our case the limit must exist.

Let us partition the interval from a to b into n equal subintervals,
the length of each being $(b - a)/n$. If for the sake of simplicity we
take the velocity v as being constant, we get a sum of n terms each
being equal to $v\,\Delta t = v(b - a)/n$. The total sum (that is, the
distance covered) is equal to
$n v\,\Delta t = n v (b - a)/n = v(b - a)$, which means that it is
independent of n. What is important here is that each separate term
diminishes in exactly the same proportion (in proportion to $1/n$) as
the number of terms n increases. It is also clear that for the case of
a variable velocity and the partition of the interval from a to b into
n equal small segments the same is true, that is, the decrease in the
length of each subinterval is inversely proportional to the growth in
the number of such subintervals. For instance, when each small
subinterval $\Delta t$ is divided into two halves, $\Delta_1 t$ and
$\Delta_2 t$, the terms in the integral sum corresponding to these
halves prove to be approximately half the initial, “large”, summand,
but the total number of terms doubles, so that the order of magnitude
of the entire sum does not change. [3.8] The reader who is not
convinced by all this reasoning should do the exercises at the end of
this section.
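The contrast between sums that settle down and sums that do not can be checked numerically. In this sketch the constant velocity v = 3 and the interval from 1 to 2 are arbitrary illustrative choices, and `total` is a hypothetical helper name.

```python
import math


def total(n, term):
    """Sum of n identical terms, each equal to term(n)."""
    return n * term(n)


def sqrt_term(n):
    return 1.0 / math.sqrt(n)        # sum equals sqrt(n) and grows without limit


def integral_term(n):
    return 3.0 * (2.0 - 1.0) / n     # v * (b - a) / n with assumed v = 3, a = 1, b = 2


def square_term(n):
    return 1.0 / n ** 2              # sum equals 1/n and shrinks to zero


for n in (100, 200, 400):
    print(n, total(n, sqrt_term), total(n, integral_term), total(n, square_term))
```

Only the middle column, whose terms shrink in proportion to 1/n, stays the same when the number of terms is doubled.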

This explanation is useful if we are dealing with the mathematical
definition of the integral as the limit of a certain sum. In physical
problems, however, the existence of the integral, that is, the limit of
the “integral sums” considered, is as a rule quite obvious. For
instance, there is no doubt that a body traveling with a finite
(constant or variable) velocity will, over a finite time interval,
traverse a certain (finite) path, that is, it will travel a certain
distance. In what follows we will explain how by employing the concept
of the integral one can calculate the area of curvilinear figures (this
question was discussed in brief above). Here too there can be no doubt
that the problem has a solution, that is, a figure must have an area,
and so the appropriate integral exists.

[3.8] Note that in the example with n summands each of which is equal
to $1/\sqrt{n}$, a 2-fold increase in the number of terms is
accompanied by a decrease in the magnitude of each term only by a
factor of $\sqrt{2}$. In the example of the sum of terms each of which
is $1/n^2$, a 2-fold increase in the number of terms leads to a 4-fold
decrease in the magnitude of each term, so that the new sum is only
half the old one.

Since the variable of integration can 
assume the values a and b , it is clear 
that the limits of integration have di- 
mensions and their dimensions are those 
of the variable of integration (in the 
distance-velocity example, the limits 
of integration have the dimensions of 
time). (As for the dimensions of the in- 
tegral, see the end of this section.) 

Clearly, the value of a definite integral depends on the values of the
function under the integral sign solely within the domain of
integration. Values of the function outside the domain of integration
have no effect on the magnitude of the integral and are of no interest
to us. For instance, the distance covered, z(a, b), depends of course
only on the value of the velocity v = v(t) inside the domain of
integration and does not depend on the velocity before t = a or after
t = b.

It was pointed out in Section 3.1 that the distance can be determined
by computing the area on the graph in which the velocity is a function
of time. Consider the problem of finding the area S bounded above by a
curve with a specified equation y = y(x), below by the axis of
abscissas (the x axis), and on the sides by the vertical lines x = a
and x = b (Figure 3.2.2), that is, of calculating the area of the
curvilinear trapezoid ABCD, whose bases are the parallel line segments
AC and BD of the straight lines x = a and x = b and whose lateral sides
are the line segment AB of the axis of abscissas and the portion CD of
the curve y = y(x). This problem reduces to computing the integral

$$S = \int_a^b y(x)\,dx. \tag{3.2.5}$$

To explain this, recall Figure 3.1.2. Imagine that the values of an
(arbitrary) independent variable x are laid off on the axis of
abscissas, the values of a function y(x) on the axis of ordinates, and
both x and y are assumed to be the distances of a variable point
M(x, y) from the axes of coordinates (see Section 1.2) and in no way
are related to the physical concepts of time and distance. The sum of
the areas of the hatched rectangles in Figure 3.1.2a is equal to
$\sum_{l=1}^{l=n} y(x_{l-1})\,\Delta x_l$; the same sum in
Figure 3.1.2b is equal to $\sum_{l=1}^{l=n} y(x_l)\,\Delta x_l$. In the
limit, as $\Delta x_l \to 0$, these sums are, by definition, equal to
the integral, and the sum of the areas of the rectangles tends to the
area bounded by the curve y(x), since the smaller the subintervals
$\Delta x_l$ the closer to the curve is the polygonal (step-like) line
bounding the rectangles.

In conclusion we note that the definite integral depends on the
integrand and the limits of integration, but it is independent of the
designation of the variable of integration. Judge for yourself. Suppose
that we have the integrand in the form $v(t) = 3t^2 + 5$. Substituting
x for t, we get $v(x) = 3x^2 + 5$. When we compute the integral, it is
immaterial how the variable of integration is designated, the only
important thing being over what range it varies and what are the values
of the function. For this reason,

$$z(a, b) = \int_a^b v(t)\,dt = \int_a^b v(x)\,dx$$

or even

$$z(a, b) = \int_a^b v(\circ)\,d\circ$$

(the last form is never used, of course). Any letter will do for the
variable of integration; the result will be the same.

A variable which (like the variable of integration) does not appear in
the final result is called a dummy variable. The variable of
integration under the integral sign can be changed to any letter
without disrupting the validity of the formulas. An ordinary (not
dummy) variable can be replaced by a different letter only in all parts
of a formula; for instance, in the formula $(x + 1)^2 = x^2 + 2x + 1$
one cannot write $(x + 1)^2 = t^2 + 2t + 1$, but in integrals we can
write $z(a, b) = \int_a^b v(t)\,dt$ or $z(a, b) = \int_a^b v(x)\,dx$.

Finally, let us turn to the question of dimensions. By definition,

$$\int_a^b y(x)\,dx = \lim_{\Delta x_l \to 0} \sum_{l=1}^{l=n} y(x_l)\,\Delta x_l. \tag{3.2.6}$$

If x is measured in units of $e_1$ and y in units of $e_2$, then each
term in the sum on the right-hand side of (3.2.6) has the dimensions of
$e_1 e_2$ (the factor $\Delta x_l$ has the dimensions of $e_1$, and
$y(x_l)$ has the dimensions of $e_2$), whereby the limit of the sum
(the integral) has the same dimensions. For instance, if the velocity
v(t) is measured in units of cm/s and t in units of s, then the
distance $z(a, b) = \int_a^b v(t)\,dt$ is measured in units of
(cm/s)·s = cm. If the abscissa x and the ordinate y are measured in
centimeters, the area $S = \int_a^b y(x)\,dx$ is measured in square
centimeters (cm·cm = cm$^2$). A transfer to new units of measurement,
say, to $e'_1$ in x and to $e'_2$ in y, results in the multiplication
of integral (3.2.5) by a factor of $k_1 k_2$, where $e_1 = k_1 e'_1$
and $e_2 = k_2 e'_2$. The numerical value of the integral increases in
the process by a factor of $k_1 k_2$, but it expresses the same
quantity as before (only in new units, $e'_1 e'_2$). For instance, in
going over from centimeters to millimeters, we increase the numerical
value of the area $S = \int_a^b y(x)\,dx$ by a factor of
10 × 10 = 100, while in going over from cm/s to m/min (and from seconds
to minutes) we are forced to multiply the value of the distance (which
before was measured in centimeters) $z(a, b) = \int_a^b v(t)\,dt$ by a
factor of $\left(0.01 : \tfrac{1}{60}\right) \times \tfrac{1}{60} = 0.01$
(the new value, of course, will be expressed in meters).
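The change-of-units rule can be verified on a concrete case. The following sketch assumes the velocity law $v = t^2$ cm/s on the interval from 1 s to 2 s; the function names and the midpoint form of the integral sum are illustrative choices, not taken from the text.

```python
def integral_sum(y, a, b, n=1000):
    """Midpoint integral sum of y over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(y(a + (l + 0.5) * dx) * dx for l in range(n))


def v_cm_per_s(t_s):
    # Assumed example velocity in cm/s, with time in seconds.
    return t_s ** 2


k1 = 1.0 / 60.0             # 1 s = k1 min
k2 = 0.01 / (1.0 / 60.0)    # 1 cm/s = k2 m/min (that is, 0.6 m/min)


def v_m_per_min(t_min):
    # The same motion expressed in the new units.
    return k2 * v_cm_per_s(t_min / k1)


z_cm = integral_sum(v_cm_per_s, 1.0, 2.0)            # distance in cm
z_m = integral_sum(v_m_per_min, 1.0 * k1, 2.0 * k1)  # distance in m
print(z_cm, z_m, z_m / z_cm, k1 * k2)
```

The ratio of the two numerical values matches the product $k_1 k_2 = 0.01$, as the dimensional argument predicts.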


Exercises 

3.2.1. Consider the case v = kt + s (uniformly accelerated motion).
Find the distance covered during the time interval from a to b by
dividing this interval into m equal subintervals; take advantage of the
fact that the terms of the sum form an arithmetic progression. Find the
limit of the sum as $m \to \infty$. Compare the resulting expression
with the area, equal to the distance covered, of a trapezoid in the
(t, v)-plane.

3.2.2. Consider the case $v = t^2$ and for this case find the distance
covered in the time interval from t = 1 to t = 2, in other words, find
the integral $\int_1^2 t^2\,dt$. To do this, partition the interval
from 1 to 2 into m equal parts and compute the sum
$\sum_{l=1}^{l=m} t_{l-1}^2\,\Delta t_l$ or the sum
$\sum_{l=1}^{l=m} t_l^2\,\Delta t_l$. Compare these two sums. [Hint. To
calculate these sums, employ the formula
$1^2 + 2^2 + 3^2 + \ldots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$.]

3.2.3. Evaluate the integral $\int_1^2 x^3\,dx$. To this end divide the
interval from 1 to 2 into m small subintervals $\Delta x$ whose lengths
form a geometric progression, and pass to the limit as $m \to \infty$.
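For Exercise 3.2.3, one possible partition (an assumption made here, since the exercise does not prescribe it) takes the points $x_l = q^l$ with $q = 2^{1/m}$, so that $x_0 = 1$, $x_m = 2$ and the lengths $x_l - x_{l-1}$ form a geometric progression with ratio q. The sketch below only evaluates the sum numerically; carrying out the summation in closed form is the substance of the exercise.

```python
def geometric_partition(a, b, m):
    """Points x_l = a * q**l with q = (b / a)**(1 / m); requires 0 < a < b."""
    q = (b / a) ** (1.0 / m)
    return [a * q ** l for l in range(m + 1)]


def left_sum(y, points):
    """Integral sum with y taken at the beginning of each subinterval."""
    return sum(y(points[l - 1]) * (points[l] - points[l - 1])
               for l in range(1, len(points)))


def cube(x):
    return x ** 3


sums = {}
for m in (10, 100, 1000, 10000):
    sums[m] = left_sum(cube, geometric_partition(1.0, 2.0, m))
    print(m, sums[m])
```

The printed values change less and less as m grows, which suggests that the limit exists even though the subintervals are unequal.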


3.3 The Relationship Between 
the Integral and the Derivative 

In the preceding sections we consid- 
ered, separately, the concepts of the 
derivative and the integral. In the 
present section we examine (using the 
distance-velocity example) the rela- 
tionship between these two concepts. 
This step will unite the calculus of 
derivatives and the calculus of inte- 
grals into a single science that abounds 
with applications, the differential and 
integral calculus , or, as it is sometimes 
called, mathematical analysis . 

We assume as given and known the instantaneous velocity as a function
of time v = v(t). We regard as constant the time t = a at the beginning
of the motion (a train leaves a certain station at a known time t = a),
and consider the distance covered in the time interval from a to b as a
function of the terminal instant b. To emphasize this, we will denote
the terminal (undefined) moment not by b but, say, by the letter u,
since it is customary (but not mandatory) to denote variable quantities
by the last letters of the English alphabet. The result is

$$z(a, u) = \int_a^u v(t)\,dt. \tag{3.3.1}$$

We know that

$$z(a, u) = z(u) - z(a) \tag{3.3.2}$$

(in the example with the train it is natural to assume that z(a) is
zero). Let us take the derivative of the left and right members,
regarding u as the (independent) variable and a as a constant, with the
result that $-z(a)$ on the right-hand side is also a constant and
$d(-z(a))/du = 0$. This yields $dz(a, u)/du = dz(u)/du$. But we know
that the derivative of the coordinate of a body with respect to time is
nothing but the instantaneous velocity of that body, so that
$dz(u)/du = v(u)$ is the velocity at time u and, hence,

$$\frac{dz(a, u)}{du} = v(u). \tag{3.3.3}$$

Substituting here the expression (3.3.1) for distance z(a, u) in the
form of an integral, we get

$$\frac{d}{du}\left(\int_a^u v(t)\,dt\right) = v(u). \tag{3.3.4}$$

This equation is the most important general property of the definite
integral. In the given form, this equation is a general mathematical
theorem. Its validity is independent of whether v(t) is velocity (and
the integral is distance) or v(t) is some other quite different
quantity. For any function, say y(x), we have

$$\frac{d}{db}\left(\int_a^b y(x)\,dx\right) = y(b), \tag{3.3.4a}$$

where we have replaced the upper (variable) limit of integration with
b.

The theorem is stated thus: the de- 
rivative of a definite integral with re- 
spect to the upper limit is equal to the 
value of the integrand at the upper limit. 

Because the theorem is so important (it is called the Newton-Leibniz
theorem [3.9]), we give a different derivation of it based on a
consideration of area. We will compute the derivative by first
principles, that is, as the limit of the ratio of the increment of the
function to the increment of the independent variable.

[3.9] In some old textbooks on differential and integral calculus this
theorem is called the fundamental theorem of higher mathematics. This
name sounds rather high-flown, but it is quite justified.

Figure 3.3.1

We consider

$$I(a, b) = \int_a^b y(x)\,dx.$$


This integral is the area bounded from above by the curve y(x), from
below by the x axis, on the left by the vertical line x = a, and on the
right by the vertical line x = b (see Figure 3.2.2).

How do we find the increment $\Delta I$ of the integral I caused by a
small increment $\Delta b$ of the upper limit b? By the definition of
an increment, $\Delta I = I(a, b + \Delta b) - I(a, b)$. The area equal
to the integral $I(a, b + \Delta b)$ differs from the area equal to the
integral $I(a, b)$ in that the right vertical is displaced rightward by
a small $\Delta b$ (compare Figure 3.3.1 with Figure 3.2.2).
Consequently, the increment $\Delta I$ is the difference between two
areas: that with the base from a to $b + \Delta b$ and that with the
base from a to b. $\Delta I$ is clearly the area of the thin strip that
is hatched in Figure 3.3.1. The base of this strip on the x axis is a
line segment of length $\Delta b$.

The desired derivative is equal to the limit

$$\frac{dI(a, b)}{db} = \lim_{\Delta b \to 0} \frac{\Delta I}{\Delta b}.$$

Quite obviously, as $\Delta b$ tends to zero, the area of the strip
approaches $y(b)\,\Delta b$, and the ratio $\Delta I/\Delta b$
approaches the quantity y(b). We have thus once again given a pictorial
proof of the Newton-Leibniz theorem:

$$\frac{d}{db}\left(\int_a^b y(x)\,dx\right) = y(b).$$
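The strip argument can be imitated numerically. In the sketch below the function $y(x) = x^2 + 1$, the limits a = 0 and b = 1, and the helper names are all illustrative assumptions; the integral itself is approximated by a midpoint integral sum with many subintervals.

```python
def integral(y, a, b, n=2000):
    """Midpoint integral sum approximating the integral of y from a to b."""
    dx = (b - a) / n
    return sum(y(a + (l + 0.5) * dx) * dx for l in range(n))


def strip_ratio(y, a, b, db):
    """Ratio (I(a, b + db) - I(a, b)) / db, the area of the thin strip
    divided by the length of its base."""
    return (integral(y, a, b + db) - integral(y, a, b)) / db


def y(x):
    # Assumed example function.
    return x ** 2 + 1.0


a, b = 0.0, 1.0
errors = []
for db in (0.1, 0.01, 0.001):
    ratio = strip_ratio(y, a, b, db)
    errors.append(abs(ratio - y(b)))
    print(db, ratio, y(b))
```

As the increment of the upper limit shrinks, the ratio approaches the value of the integrand at the upper limit.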


The definite integral of a known function y(x) or v(t) is a function of
the limits of integration a and b, that is, a function of two
variables, a and b. Formula (3.3.4) or (3.3.4a) yields the value of the
derivative of this function with respect to the upper limit of
integration, the variable b. Below, in Section 3.5, we will find the
value of its derivative with respect to the lower limit of integration,
a. [3.10]

The definition of an integral as the 
limit of a sum, which was given in the 
preceding section, explains the role 
the integral concept plays in the solu- 
tion of physical or geometric problems: 
in computing the distance traveled when 
a body is moving with a (variable) 
velocity that is known, in determining 
the area bounded by a curvilinear tra- 
pezoid, and in other problems that lead 
to the construction of “integral sums” 
(below we will encounter many prob- 
lems of this type). But this definition 
does not yield a convenient general
method for computing an integral or
finding the value of the integral in the
form of a formula, as a function of the
limits of integration. [3.11]

A method for finding such a formula follows from the Newton-Leibniz
theorem proved above, a theorem that concerns the derivative of an
integral. Here, besides the property of the derivative of an integral,
we make use of yet a second property of the definite integral: the
definite integral is equal to zero when the upper and lower limits of
integration coincide:

$$z(a, a) = \int_a^a v(t)\,dt = 0. \tag{3.3.5}$$

This property is obvious because the distance is equal to zero if the
time in transit is zero.

The formula which yields the value of the integral as a function of the
limits of integration will be derived in this fashion in Section 3.4.

[3.10] Here, of course, we are speaking of the partial derivatives of
the function $I(a, b) = \int_a^b y(x)\,dx$ of two variables, a and b
(see Section 4.13).

[3.11] Only in rare cases and with great difficulty is one able to sum
an arbitrary number of small summands present in the definition of the
integral (e.g. see Exercises 3.2.2 and 3.2.3).

Let us now study the question of the integral of a derivative. In the
integral

$$\int_a^b v(t)\,dt \tag{3.3.6}$$

the integrand may be an arbitrary function, and of course there is no
universal method that enables finding the integral for an arbitrary
function v(t). But suppose we are lucky and have found a function F
such that the integrand v(t) is the derivative of this function F:

$$v(t) = \frac{dF(t)}{dt}. \tag{3.3.7}$$

In such a favorable case everything turns out wonderfully: here we can
easily find the exact value of the integral.

To write it out, we recall the approximate expression for the increment
of a function (cf. (2.4.6)):

$$\Delta F \approx F'(t)\,\Delta t = v(t)\,\Delta t,$$

or in the more common notation, which refers to the integral
$\int_a^b f(x)\,dx$, with $f(x) = dF(x)/dx$,

$$\Delta F \approx F'(x)\,\Delta x = f(x)\,\Delta x. \tag{3.3.8}$$

The quantity in the right member of the equation is precisely one of
the summands whose sum is equal to the integral. And so, by putting,
say, $\Delta x_l = x_l - x_{l-1}$ and, hence,
$\Delta F = F(x_l) - F(x_{l-1})$, we can write, approximately,

$$\Delta F = F(x_l) - F(x_{l-1}) \approx f(x_l)(x_l - x_{l-1}) = f(x_l)\,\Delta x_l. \tag{3.3.9}$$

As already mentioned, (3.3.9) is approximate and its accuracy increases
as the difference between $x_l$ and $x_{l-1}$, that is, the increment
$\Delta x_l$, becomes smaller. But as the difference
$x_l - x_{l-1} = \Delta x_l$ decreases, that is, as $x_l$ approaches
$x_{l-1}$, the difference between $f(x_l)$ and $f(x_{l-1})$ also
diminishes. For this reason, we have just as much right (the degree of
accuracy is the same) to put $f(x_{l-1})$ in the right-hand side of
(3.3.9) as we do $f(x_l)$, as was done above, where we selected the
latter case.

Figure 3.3.2

Let us now write formulas like (3.3.9) for all subintervals into which
the domain of integration, that is, the interval from a to b, is
partitioned. Suppose that the interval is partitioned into five
subintervals (Figure 3.3.2) so that $x_0 = a$ and $x_5 = b$. Then

$$\begin{aligned}
F(x_1) - F(x_0) &\approx f(x_1)(x_1 - x_0),\\
F(x_2) - F(x_1) &\approx f(x_2)(x_2 - x_1),\\
F(x_3) - F(x_2) &\approx f(x_3)(x_3 - x_2),\\
F(x_4) - F(x_3) &\approx f(x_4)(x_4 - x_3),\\
F(x_5) - F(x_4) &\approx f(x_5)(x_5 - x_4).
\end{aligned} \tag{3.3.10}$$

Now add them together. In the left members, all values of the function
F for intermediate values of x cancel out leaving only

$$F(x_5) - F(x_0) = F(b) - F(a).$$

On the right-hand side we have precisely those sums with the aid of
which we approximately expressed the integral in Section 3.1
(expressing the distance z(a, b) for a given velocity v(t)). Thus,

$$F(b) - F(a) \approx \sum_{l=1}^{l=5} f(x_l)(x_l - x_{l-1}) = \sum_{l=1}^{l=5} f(x_l)\,\Delta x_l \approx \int_a^b f(x)\,dx,$$

where $f(x) = dF/dx$.

The smaller each increment $\Delta x_l$, or the difference
$x_l - x_{l-1}$, the more exact the expression (3.3.9) for the
increment $\Delta F$. But as all the differences
$\Delta x_l = x_l - x_{l-1}$ decrease (and the number of subintervals
$\Delta x_l$ grows without limit), the sums tend to the integral.
Therefore, the equation

$$F(b) - F(a) = \int_a^b f(x)\,dx \quad \text{for } f(x) = \frac{dF}{dx} \tag{3.3.11}$$

is now exact.

The last statement seems to be quite convincing, but it can be
justified by an even more exact computation. The thing is that the
error introduced by a formula similar to (3.3.8) or (3.3.9) is,
generally speaking, quadratic: it is approximately proportional to the
second power of the length $\Delta x$ of the interval, that is, has the
form $c\,(\Delta x)^2$, where the quantity c = c(x) depends on x but
weakly and with a high degree of accuracy can be assumed to be a
constant (all this will be explained in Chapter 6). Thus, for
$\Delta x = (b - a)/n$, each expression in (3.3.10) (the number of such
expressions must be taken to be n instead of five) introduces an error
proportional to $(\Delta x)^2 = (1/n^2)(b - a)^2$. The total error
obtained through summation of all such approximate equations will be
proportional to $n\,[(1/n^2)(b - a)^2] = (1/n)(b - a)^2$, from which it
follows that it tends to zero as $n \to \infty$.
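The 1/n behavior of the total error is easy to observe. This sketch assumes the pair $F(x) = x^3/3$, $f(x) = x^2$ on the interval from 1 to 2 and sums the right members of the equations of type (3.3.10) for n equal subintervals; the name `sum_error` is hypothetical.

```python
def F(x):
    # Assumed example: F'(x) = f(x).
    return x ** 3 / 3.0


def f(x):
    return x ** 2


def sum_error(n, a=1.0, b=2.0):
    """Difference between the sum of f(x_l) * dx over n equal subintervals
    and the exact increment F(b) - F(a)."""
    dx = (b - a) / n
    approx = sum(f(a + l * dx) * dx for l in range(1, n + 1))
    return approx - (F(b) - F(a))


for n in (5, 50, 500):
    err = sum_error(n)
    # If the total error is proportional to 1/n, n * err stays nearly constant.
    print(n, err, n * err)
```

Each tenfold increase of n reduces the error roughly tenfold, in agreement with the estimate above.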

Formula (3.3.11) establishes a re- 
lationship between the integral and 
the derivative. From this formula it 
follows that if we are able to find a 
function F whose derivative is equal 
to the integrand /, the problem of eval- 
uating the integral is solved, for all 
that remains is to substitute a 
and b for the independent variable of 
this function and find the difference 
F (b) — F (a). 

Since formula (3.3.11) is so impor- 
tant, in the sections that follow we will 
give a different derivation of it on the 
basis of a more detailed consideration 
of the properties of the integral and 
the function F . 


3.4 The Indefinite Integral 

In the preceding sections we introduced the concept of a definite
integral as the limit of a sum of a large number of small terms. In
Section 3.3 we elucidated the principal property of the definite
integral: the derivative of a definite integral with respect to the
upper limit is equal to the integrand, that is,

$$\text{if } z(a, b) = \int_a^b v(t)\,dt, \text{ then } \frac{dz(a, b)}{db} = v(b). \tag{3.4.1}$$

We now wish to take advantage of this 
property to compute a definite integral. 

We seek a function of b whose derivative is the known function v(b). We
denote this function by F(b). Then, by definition,

$$\frac{dF(b)}{db} = v(b). \tag{3.4.2}$$

This equation does not fully define the function F(b). We know that the
addition of any constant to a function does not change the derivative
of that function. Hence, if F(b) satisfies (3.4.2), so will the
function G(b) = F(b) + C for any constant C.

The function F(b) which satisfies Eq. (3.4.2) is called the indefinite
integral of the function v(b). Similarly, if

$$\frac{dF(x)}{dx} = f(x), \tag{3.4.2a}$$

it is said that F(x) is the indefinite integral of the function f(x).
This term reflects two properties of the function F(x). First, the
function F has the same derivative with respect to b as the (definite)
integral z(a, b), or $\int_a^b f(x)\,dx$ (cf. (3.4.1), (3.4.2), and
(3.4.2a)); for this reason F is also called an integral. On the other
hand, condition (3.4.2) or (3.4.2a) does not define this function
completely, since we can always add a constant quantity C to it, whence
the modifying adjective “indefinite.”

Any solution to Eq. (3.4.2) or (3.4.2a) 
can differ from a solution F ( b ) (or 
F(x)) solely by a constant . Indeed, if 
there is another solution to (3.4.2), 
G(b), then for their difference F - G we have

$$\frac{d}{db}\,[F(b) - G(b)] = v(b) - v(b) = 0.$$

But only the derivative of a constant 
is equal to zero for arbitrary values of 
the argument, that is, only a body at 
rest has a zero velocity all the time. 

According to (3.4.1), the definite integral z(a, b) is also one of the solutions to (3.4.2). Hence, z(a, b) can likewise be represented as

$$z(a, b) = F(b) + C, \tag{3.4.3}$$

where F (b) is a solution to (3.4.2) 
and C is a constant, and it only remains 
to determine the constant. To do this, 
we take advantage of the second prop- 
erty of the definite integral: it is 
equal to zero when the upper limit 
coincides with the lower limit, 

$$z(a, a) = 0 \tag{3.4.4}$$

(see (3.3.5)). Substituting a for b in 
(3.4.3) and using (3.4.4), we get 
0 = F(a) + C, or C = -F(a). From this we finally have

$$z(a, b) = F(b) - F(a), \tag{3.4.5}$$

or

$$\int_a^b f(x)\,dx = F(b) - F(a), \tag{3.4.5a}$$

where F'(x) = f(x).

It is to be noted that the “indetermi- 
nacy” of the function F(b) does not in 
any way hamper computation, with 
its aid, of a definite integral from for- 
mula (3.4.5) or (3.4.5a). Indeed, in 
place of F ( b ) take some other solution 
to Eq. (3.4.2), say G(b), which differs from F(b) by a constant: G(b) = F(b) + C. We will evaluate the definite integral using (3.4.5), taking G instead of F:

$$z(a, b) = G(b) - G(a) = [F(b) + C] - [F(a) + C] = F(b) - F(a).$$

The result is the same as above. 

In the distance-time problem it is convenient to denote the indefinite integral by the same letter z as we used for the definite integral. For a given integrand v(t), the definite integral depends on the upper and lower limits of integration, which is to say, it is a function of two variables: z = z(a, b). The indefinite integral is a function of one variable, which we denote by t. Thus, the indefinite integral z(t) is a function that satisfies the equation

$$z'(t) = \frac{dz}{dt} = v(t). \tag{3.4.6}$$

Using this function, we can find the definite integral z(a, b) of the function v(t) from the formula

$$z(a, b) = \int_a^b v(t)\,dt = z(b) - z(a). \tag{3.4.7}$$

The following compact notation is used to state the difference between the values of one and the same function for two different values of the variable:

$$z(t)\,\Big|_a^b = z(b) - z(a). \tag{3.4.8}$$

Here, on the left is the function of a dummy variable t, which is followed by a vertical line at the top of which is the value of the variable at which we desire to take the function with a plus sign, and below, the value for which we want to take the function with a minus sign.

Substituting, under the integral sign in (3.4.7), v(t) expressed in terms of z(t) in accord with (3.4.6) and putting expression (3.4.8) into the right-hand side of (3.4.7), we get the identity

$$\int_a^b z'(t)\,dt = z(t)\,\Big|_a^b, \tag{3.4.9}$$

or, in other notation,

$$\int_a^b F'(x)\,dx = F(x)\,\Big|_a^b. \tag{3.4.9a}$$


It will be seen that the positions of a and b on the left and on the right are the same, which is a good mnemonic device for memorizing (3.4.9) or (3.4.9a).

We have studied in sufficient detail 
the concept of the indefinite integral. 
It is now time to examine some exam- 
ples. Let us consider the problem of the 
distance covered during the time interval from a to b at a velocity equal to $v(t) = t^2$. The sought distance is equal to the definite integral

$$z(a, b) = \int_a^b t^2\,dt.$$

In this problem, the indefinite integral z(t) is obtained by solving the equation

$$\frac{dz(t)}{dt} = v(t) = t^2. \tag{3.4.10}$$

But we know that $d(t^3)/dt = 3t^2$, hence $d(t^3/3)/dt = (1/3)(3t^2) = t^2$. Consequently, the equation is satisfied by the function

$$z(t) = \frac{t^3}{3} + C, \tag{3.4.11}$$


where C is an arbitrary constant that 
can be taken to be zero, since we know 
that this does not affect in any way 
the difference of two values of function 
z for two values of the independent
variable. 

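Substituting (3.4.10) and (3.4.11) into (3.4.9) yields

$$\int_a^b t^2\,dt = \frac{t^3}{3}\,\Big|_a^b = \frac{b^3}{3} - \frac{a^3}{3}.$$

The special case of a = 1 and b = 2 yields

$$\int_1^2 t^2\,dt = \frac{8}{3} - \frac{1}{3} = \frac{7}{3} \approx 2.333.$$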


Thus, using the indefinite integral, 
we obtain in a few lines the exact result 
which we laboriously approached in 
Section 3.1 by numerical computations. 
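
The contrast between the two routes to the answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `riemann_sum` and `by_antiderivative`, the choice of a left-endpoint sum, and the numbers of subintervals are not taken from the text.

```python
def riemann_sum(v, a, b, n):
    """Left-endpoint sum v(t_0)(t_1 - t_0) + v(t_1)(t_2 - t_1) + ... over n equal subintervals."""
    dt = (b - a) / n
    return sum(v(a + k * dt) * dt for k in range(n))


def by_antiderivative(z, a, b):
    """Definite integral from the indefinite one: z(b) - z(a)."""
    return z(b) - z(a)


v = lambda t: t ** 2        # integrand of the example in the text
z = lambda t: t ** 3 / 3    # indefinite integral with the constant C taken as zero

exact = by_antiderivative(z, 1.0, 2.0)
for n in (10, 100, 1000):   # illustrative numbers of subintervals
    approx = riemann_sum(v, 1.0, 2.0, n)
    print(n, approx, abs(approx - exact))
print("z(2) - z(1) =", exact)
```

The sum creeps toward the limit only slowly as n grows, whereas the difference z(b) - z(a) needs two function evaluations. Adding any constant to `z` leaves `by_antiderivative` unchanged.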

The definite integral is the limit of 
a sum of the form 

$$v(t_0)\,(t_1 - t_0) + v(t_1)\,(t_2 - t_1) + \dots$$

as each term tends to zero and the num- 
ber of terms increases correspondingly. 
In an approximation of this sum, one 
has to partition the domain of inte- 
gration into a number of subintervals, 
find the approximate value of the distance $v\,\Delta t$ in each subinterval, and
then add these values. To obtain good 
accuracy requires numerous arithmetic 
operations. But if we know the indef- 
inite integral z (t), that is, if we know 
the function whose derivative is equal 
to the integrand v(t), then any definite integral $\int_a^b v(t)\,dt$ is obtained immediately from formula (3.4.9). The possibility of finding functions with a
given derivative (indefinite integrals) 
has “suddenly” put at our disposal 
a powerful method for computing sums 
(definite integrals). 

The indefinite integral is sometimes 
called a primitive function (antideriv- 
ative). This term is understood as the 
inverse of a derivative: we are speak- 
ing of a function whose derivative is 
known. The term is used in textbooks 
where the problem of finding a func- 
tion from the known derivative of the 
function is solved before definite inte- 
grals are considered, that is, where in- 
tegral calculus is started from the con- 
cept of an indefinite integral rather 
than that of a definite integral. We 
do not use this approach here. 

An indefinite integral can always 
be expressed in terms of a definite in- 
tegral: 

$$z(t) = C + \int_{a_0}^{t} v(x)\,dx. \tag{3.4.12}$$

Applying the rule concerning the de- 
rivative of a definite integral with 
respect to the upper limit, it is easy
to verify that z(t) given by Eq. (3.4.12)
satisfies (3.4.6) for arbitrary constants
C and $a_0$.

In all problems, the answer always 
involves the difference z (b) — z (a), 
and this is independent of C and a 0 . 
Therefore (3.4.12) can be written more 
compactly: 

$$z(t) = \int^{t} v(x)\,dx.$$

This is frequently abridged still further to

$$z(t) = \int v(t)\,dt. \tag{3.4.13}$$

This notation is widely used and we 
will employ it, but one should bear 
in mind that, properly speaking, it 
is not correct.^{3.12} It may be compared
with those grammatically loose expres- 
sions that occur in everyday speech 
and are clear to all (except children and 
pedants), for example, expressions like 
“to eat another plate” or “I’m ready 
to throw the whole thing over”. No- 
tations and expressions that are not 
altogether precise but are sufficiently 
clear are frequently used in mathematics; they do not usually create any
difficulties for anyone.

The notation (3.4.13) violates the 
rule by which the dummy variable of 
integration does not appear in the re- 
sult. Therefore, when using the abridg- 
ed notation (3.4.13) one should always 
bear in mind that this is a conventional 
contraction of the exact expression 
(3.4.12), in which C = z (a 0 ). 

The familiar formulas for derivatives 
yield a table of indefinite integrals: 

$$\int dt = t, \quad \int t\,dt = \frac{t^2}{2}, \quad \int t^2\,dt = \frac{t^3}{3}, \quad \int \frac{dt}{t^2} = -\frac{1}{t}.$$

3.12 Note that in the left-hand side of (3.4.13) we have a function z(t), while in the right-hand side we have an indefinite integral, that is, a family of functions F(t) + C differing from one another by a constant (since in F(t) + C the constant C is arbitrary).


In addition, the results of Exercises 
2.4.2 and 2.4.3 enable us to write 

$$\int \frac{dt}{\sqrt{t}} = 2\sqrt{t}, \quad \int \sqrt{t}\,dt = \frac{2}{3}\,t\sqrt{t}.$$

To verify any of these formulas, it 
suffices to find the derivative of the 
right-hand side. If in doing so we get 
the function under the integral sign, the 
formula is correct. 
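
The verification rule just stated (differentiate the right-hand side and compare with the integrand) can also be tried numerically. The sketch below is an assumption-laden illustration, not part of the text's method: it replaces the exact derivative with a central difference quotient, and the step `h`, the sample points, and the names `derivative` and `max_mismatch` are hypothetical choices.

```python
import math


def derivative(F, t, h=1e-6):
    """Central difference quotient approximating dF/dt at the point t."""
    return (F(t + h) - F(t - h)) / (2 * h)


# (label, integrand f, proposed indefinite integral F) for each table entry
table = [
    ("1",         lambda t: 1.0,                lambda t: t),
    ("t",         lambda t: t,                  lambda t: t ** 2 / 2),
    ("t^2",       lambda t: t ** 2,             lambda t: t ** 3 / 3),
    ("1/t^2",     lambda t: 1 / t ** 2,         lambda t: -1 / t),
    ("1/sqrt(t)", lambda t: 1 / math.sqrt(t),   lambda t: 2 * math.sqrt(t)),
    ("sqrt(t)",   lambda t: math.sqrt(t),       lambda t: (2 / 3) * t * math.sqrt(t)),
]


def max_mismatch(f, F, points=(0.5, 1.0, 2.0, 3.0)):
    """Largest difference between dF/dt and f over a few positive sample points."""
    return max(abs(derivative(F, t) - f(t)) for t in points)


for label, f, F in table:
    print(label, max_mismatch(f, F))
```

A mismatch at the level of rounding error supports the formula. A numerical check of this kind is of course weaker than carrying out the differentiation by hand.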

Methods for finding indefinite inte- 
grals of a variety of functions are con- 
sidered in detail in Chapter 5. Thanks 
to the relationship which exists be- 
tween an integral and a derivative, we 
are able to find the integrals of a large 
number of functions. 

The problem of integration is techni- 
cally a much more complicated job 
than the problem of finding derivatives. 
The complexity is due, for one thing, 
to the fact that in the integration of 
rational (i.e. not containing radicals) 
algebraic expressions we get answers 
involving logarithms and inverse tri- 
gonometric functions. In the integra- 
tion of algebraic expressions with rad- 
icals the result is sometimes expressible 
only with the aid of new, nonelement- 
ary, functions that cannot be expressed 
in terms of a finite number of opera- 
tions involving algebraic, power, and 
trigonometric functions (see Section 5.5). 
However, the difficulties of express- 
ing integrals by formulas should not 
eclipse the fundamental simplicity and 
clarity of the integral concept. If it 
is impossible (or difficult) to evaluate 
an integral by formula (3.4.5a), it is 
always possible to evaluate an inte- 
gral by means of cumbersome, yet 
fundamentally very simple, computa- 
tions. 

In the last few years, with the advent 
of programmable pocket calculators, it 
has become very easy to calculate the 
sum of 10, 20, or even 50 terms — 
this requires 10 to 30 minutes depend- 
ing on the complexity of the integrand. 
In many cases, therefore, a direct cal- 
culation of an integral is preferable 
to trying to obtain complex formulas. In the diary of the famous physicist Enrico Fermi there is a calculation of a rather simple definite integral.
Notwithstanding the fact that the cor- 
responding indefinite integral could 
be expressed in terms of the inverse sine 
and the natural logarithm, Fermi pre- 
ferred to find the integral numerically, 
not employing formula (3.4.5a). 

Many books are devoted to fast 
and exact methods of numerical 
integration. The reader wishing to 
familiarize himself with these methods 
can turn to the book [15]. 

Exercises 

3.4.1. Evaluate the following integrals: (a) $\int_0^1 t^2\,dt$, (b) $\int_1^{1.1} t^2\,dt$, (c) $\int_1^2 dt/t^2$, and (d) $\int_1^3 dt/\sqrt{t}$.

3.4.2. (a) Using an integral, find the area 
of a right triangle having a base b and an al- 
titude h. [Hint. Put the origin of coordinates 
at the vertex of an acute angle of the triangle 
and the right angle on the x axis (Fig- 
ure 3.4.1a). Find the equation of the hypote- 
nuse in this system of coordinates and find the 
area as an integral.] 

(b) Find the area of the same triangle by 
placing the right angle at the origin and an 
acute angle of the triangle at the point (b, 0) (Figure 3.4.1b). [Hint. In integrating, make use of the obvious property of the integral of a sum of two terms, $\int (f + g)\,dx = \int f\,dx + \int g\,dx$.]

Remark. Do not be indignant that a lot of 
effort is put into finding the familiar answer 
$S_\triangle = (1/2)\,bh$ because the method of integration will be used later on in cases where
elementary methods of finding areas do not 
suffice. 

3.4.3. (a) Find the area under the parabola $y = ax^2$ (Figure 3.4.2a) bounded by the vertical line $x = x_0$ and the axis of abscissas. Express the area in terms of the coordinates $x_0$ and $y_0$ of the end of the arc OA of the parabola.

(b) The same for a parabola passing through the origin and having a horizontal tangent at point $B(x_0, y_0)$ (Figure 3.4.2b). [Hint. The answer may be obtained at once by using the result of Exercise 3.4.3 (a). Still and all, take the hard way and do all the operations in their requisite order, without resorting to a clever trick. Seek the equation of the parabola in the form $y = kx^2 + px + q$, where k, p, and q are found from the condition of passage through the point $(x_0, y_0)$ and the origin (0, 0) and from the condition that the tangent at the point $(x_0, y_0)$ is horizontal. Express the area in terms of $x_0$ and $y_0$. If you are not able to use the result of Exercise 3.4.3 (a), then first try the above-described procedure and the result will suggest the relationship between Exercises 3.4.3 (a) and 3.4.3 (b).]

3.4.4. Write the expression for the area of 
a semicircle of radius r (Figure 3.4.3) in the 
form of a definite integral. [Hint. Use the 
equation of a circle, $x^2 + y^2 = r^2$.]

3.4.5. Evaluate the integral $\int_0^1 dx/(1 + x^2)$ using the trapezoid formula (see Section 3.1), putting first n = 5 and then n = 10. Carry
out the computations to the fourth decimal 
place. 

Remark. The exact value of the integral is 
$\pi/4 \approx 0.785398$ (see Exercise 5.2.4). An approximate evaluation of the integral enables us to obtain an approximate value for the number $\pi$.
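
For readers who prefer a machine to a pocket calculator, the computation asked for in Exercise 3.4.5 might be organized as follows. This is a sketch only: the function name `trapezoid` and the use of equal subintervals are assumptions, and the trapezoid formula of Section 3.1 is taken here in its usual form (half weight on the two end ordinates).

```python
import math


def trapezoid(f, a, b, n):
    """Trapezoid formula with n equal subintervals of width h = (b - a) / n."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


f = lambda x: 1.0 / (1.0 + x * x)

for n in (5, 10):
    approx = trapezoid(f, 0.0, 1.0, n)
    # four times the integral approximates pi, as the Remark points out
    print(n, round(approx, 4), round(4 * approx, 4), abs(approx - math.pi / 4))
```

Doubling n from 5 to 10 should reduce the error by roughly a factor of four, which is the behaviour one expects of the trapezoid formula for a smooth integrand.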




Figure 3.4.1 


Figure 3.4.3 
Figure 3.4.5

3.4.6. Construct the graph of the function $F(x) = \int_a^x y(x)\,dx$. The function y(x) is given graphically (Figure 3.4.4). For values of a take a = 0, a = 4, and a = 8.

3.4.7. Construct the graph of the function $F(x) = \int_0^x y(x)\,dx$, where the (possible) graphs of the function y(x) are depicted in Figure 3.4.5.

3.4.8. Construct the curves $F(x) = \int_b^x \varphi(x)\,dx$, where the functions $\varphi(x)$ are given by the curves which appear in Figures H.6 and H.7 at the end of this book. Compare F(x) with the curves given in Figures 2.5.5 and 2.5.6.


3.5 Properties of Integrals 

Above we considered the simplest case 
of a definite integral having a positive 
integrand and with the upper limit of 
integration greater than the lower lim- 
it: 

$$z(a, b) = \int_a^b v(t)\,dt, \qquad v(t) > 0 \text{ and } b > a.$$

The integral here is clearly positive, 
since it is equal to the limit of a sum 
of positive terms. The integral has the 


simple physical meaning of a distance 
covered (if v = v ( t ) is the velocity) 
or an area (if v = v ( t ) is the equation 
of a curve). But what is the sign of 
the integral of a negative function, 
that is, in the case of v (t) < 0? 

For the time being, we leave the 
condition b > a. In the expression 
for the sum (which in the case of pas- 
sage to the limit becomes an integral), 
the factor $\Delta t$ in each term is positive and the factor v(t) is negative, which means that each summand is negative, the sum is negative, and hence the integral is negative, too. To summarize, if v(t) is negative for a < t < b (so that b > a), then $\int_a^b v(t)\,dt < 0$. In the case of motion, the meaning of the
answer is simple: a negative v (t) sig- 
nifies that the motion occurs in a di- 
rection opposite to the positive direc- 
tion, or the direction of increasing z 
coordinate. The distance traversed in 
the negative direction will always be 
regarded as negative. In this case, 
z decreases, z (b) < z (a). The general 
formula 

$$z(b) = z(a) + z(a, b) = z(a) + \int_a^b v\,dt, \quad \text{or} \quad \int_a^b v(t)\,dt = z(b) - z(a),$$

remains valid, only in the second form 
both sides are negative. 

In the case of a velocity that changes sign, it may happen, in particular, that $\int_a^b v(t)\,dt = 0$, although b > a and $b \neq a$. This occurs if part of the time
between a and b the body moves in 
one direction, and during another time 
interval it moves in the opposite di- 
rection, with the result that by time b 
it returns to the position it occupied at 
time a. 

Let us examine the problem of the 
area under a curve. For b > a and 
v(t) > 0, the integral is equal to the area bounded by the curve v = v(t), the t axis, and the vertical lines t = a and t = b (Figure 3.5.1a). For b > a and v(t) < 0, we know that $\int_a^b v\,dt < 0$. In this case the curve lies below the axis of abscissas (Figure 3.5.1b). Thus, in order to preserve the law by which the area is equal to an integral, it is necessary to regard the area as negative if the curve v = v(t) lies below the axis of abscissas.

Figure 3.5.1

If we take a function that changes 
sign, say, v ( t ) = sin t (Figure 3.5.2), 
then, as can easily be seen, the area 
under such a curve over an interval 
equal to the period of the function, 
that is, from t = 0 to t = $2\pi$, will
be equal (by our definition) to zero. This 
means that the area under the first 
half-wave, which we assume to be pos- 
itive, and the negative area under 
the second half-wave cancel each other 
exactly.^{3.13}
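
The difference between the signed area of the sine wave and the total area of the two half-waves taken without regard to sign can be seen in a short numerical sketch. The midpoint rule, the number of subintervals, and the names `midpoint_integral`, `signed`, and `unsigned` are assumptions made for this illustration.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Sum of f(midpoint) * dt over n equal subintervals of [a, b]."""
    dt = (b - a) / n
    return sum(f(a + (k + 0.5) * dt) for k in range(n)) * dt


# Signed area over one full period: the two half-waves cancel.
signed = midpoint_integral(math.sin, 0.0, 2 * math.pi)

# Area counted without sign: integrate each half-wave separately
# and add the absolute values.
unsigned = (abs(midpoint_integral(math.sin, 0.0, math.pi))
            + abs(midpoint_integral(math.sin, math.pi, 2 * math.pi)))

print("signed area:  ", signed)
print("unsigned area:", unsigned)
```

The first number is zero up to rounding, while the second is close to 4, since each half-wave encloses an area of 2.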

3.13 If our problem is to find how much paint is required to paint over the hatched portions in Figure 3.5.2, such a definition of area is no good. In this case we have to partition the entire integral into parts over which v does not change sign (in our case there will be two parts, from 0 to $\pi$ and from $\pi$ to $2\pi$), then compute the integral over each portion separately, and finally add the absolute values of the integrals of the separate parts.

The definite integral also generalizes to the case when the upper limit is less than the lower. In this case we will no longer speak of the distance, time,
and velocity (cf. Section 3.1) and will 
regard the definition of the integral 
as a sum (see Section 3.2). Referring 
to Figure 3.5.3, we again partition the 
interval between a and b into n parts 
by the intermediate values $t_1, t_2, \dots, t_{n-1}$ and convince ourselves that all the $\Delta t$ are now negative. It is now easy to see that

$$\int_a^b v(t)\,dt = -\int_b^a v(t)\,dt, \tag{3.5.1}$$

since in any partition of the interval from a to b the corresponding sums will differ as to signs of the subintervals $\Delta t$ in all terms.

An essential property of the integral 
consists in the fact that the domain 
of integration may be divided into parts: 
the distance covered in the time inter- 
val between a (beginning) and b (end) 
may be represented as the sum of the 
distances traversed between times a and c (an intermediate point in time) and between c and b (Figure 3.5.4a):

$$\int_a^b v(t)\,dt = \int_a^c v(t)\,dt + \int_c^b v(t)\,dt. \tag{3.5.2}$$

Figure 3.5.3

Figure 3.5.4

With the aid of (3.5.1) we can extend formula (3.5.2) to the case where c lies outside the interval from a to b instead of inside the interval. Suppose that c > b > a (Figure 3.5.4b). Then, obviously,


$$\int_a^c v(t)\,dt = \int_a^b v(t)\,dt + \int_b^c v(t)\,dt. \tag{3.5.3}$$


Transpose the last term to the left and take advantage of (3.5.1). The result is

$$\int_a^c v\,dt - \int_b^c v\,dt = \int_a^c v(t)\,dt + \int_c^b v(t)\,dt = \int_a^b v\,dt. \tag{3.5.4}$$



In this way we get an equation that coincides exactly with (3.5.2).

We can likewise consider different arrangements of the numbers (points) a, b, and c on the number axis (there are six such variants in all). The reader will find no difficulty in considering them and in convincing himself that formula (3.5.2) proves to be valid in all these cases, that is to say, irrespective of the mutual arrangement of the numbers (points) a, b, and c.

Actually, we have derived all these properties of definite integrals from the definition of an integral as the limit of a sum. These properties likewise follow from the expression for a definite integral in terms of an indefinite integral. Indeed, suppose that the indefinite integral $\int v(t)\,dt$ equals z(t). Then

$$\int_a^b v(t)\,dt = z(b) - z(a) \quad \text{and} \quad \int_b^a v(t)\,dt = z(a) - z(b) = -\int_a^b v(t)\,dt.$$

Similarly,

$$\int_a^b v(t)\,dt = z(b) - z(a), \quad \int_a^c v(t)\,dt = z(c) - z(a), \quad \int_c^b v(t)\,dt = z(b) - z(c),$$

whence

$$\int_a^b v(t)\,dt = \int_a^c v(t)\,dt + \int_c^b v(t)\,dt.$$

Figure 3.5.5
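
Rules (3.5.1) and (3.5.2) can be checked numerically for all six arrangements of a, b, and c. The following sketch assumes a midpoint sum whose step is allowed to be negative when the upper limit is less than the lower; the integrand, the three sample points, and the name `integral` are illustrative choices rather than anything prescribed by the text.

```python
from itertools import permutations


def integral(f, a, b, n=2000):
    """Midpoint sum from a to b; the step (b - a) / n is negative when b < a."""
    dt = (b - a) / n
    return sum(f(a + (k + 0.5) * dt) for k in range(n)) * dt


v = lambda t: t * t - 1.0   # an integrand that changes sign on the interval used

for a, b, c in permutations((0.0, 1.5, 3.0)):
    whole = integral(v, a, b)
    parts = integral(v, a, c) + integral(v, c, b)
    # whole and parts should agree for every ordering of a, b, c
    print(f"a={a}, b={b}, c={c}: {whole:.6f}  {parts:.6f}")
```

Because the same function is called whether c lies inside or outside the interval from a to b, the agreement of the two columns illustrates that formula (3.5.2) does not depend on the mutual arrangement of the points.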


The fundamental law by which the derivative of an integral is equal to the integrand refers to the derivative with respect to the upper limit. If the definite integral is regarded as a function of the lower limit of integration, with the upper limit held constant, then we get the answer with opposite sign:

$$\frac{dz(a, b)}{da} = \frac{d}{da}\left(\int_a^b v(t)\,dt\right) = -v(a). \tag{3.5.5}$$

The minus sign in this formula is easy to understand if we regard the integral as an area: a positive increment $\Delta a$ clearly diminishes the area under the curve (Figure 3.5.5).^{3.14} We can formally obtain the same result by interchanging the limits of integration (this will introduce a minus sign) and by using the familiar law on the derivative with respect to the upper limit:

$$\frac{d}{da}\left(\int_a^b v(t)\,dt\right) = \frac{d}{da}\left(-\int_b^a v(t)\,dt\right) = -v(a).$$

3.14 The area bounded by the vertical lines t = a + $\Delta a$ and t = b, the curve v = v(t), and the t axis is smaller than the one bounded by the lines t = a and t = b, the curve v = v(t), and the t axis.
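
The sign in (3.5.5) can be observed directly by shifting each limit a little and watching how the integral responds. The sketch below makes several assumptions of its own: a midpoint sum for the integral, central difference quotients with a hypothetical step `h`, and the integrand cos t on the interval from 0.3 to 2.

```python
import math


def integral(f, a, b, n=4000):
    """Midpoint sum approximating the definite integral from a to b."""
    dt = (b - a) / n
    return sum(f(a + (k + 0.5) * dt) for k in range(n)) * dt


def d_upper(f, a, b, h=1e-4):
    """Difference quotient of the integral with respect to the upper limit b."""
    return (integral(f, a, b + h) - integral(f, a, b - h)) / (2 * h)


def d_lower(f, a, b, h=1e-4):
    """Difference quotient of the integral with respect to the lower limit a."""
    return (integral(f, a + h, b) - integral(f, a - h, b)) / (2 * h)


a, b = 0.3, 2.0
print("d/db:", d_upper(math.cos, a, b), " v(b):", math.cos(b))
print("d/da:", d_lower(math.cos, a, b), "-v(a):", -math.cos(a))
```

The first line reproduces the integrand at the upper limit, the second reproduces it at the lower limit with the opposite sign.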


In connection with the question of 
the sign of the integral, we note an 
example that frequently disconcerts 
the beginner. Let us consider

$$\int \frac{dx}{x^2} = -\frac{1}{x}. \tag{3.5.6}$$


This equation follows from the earlier 
found value of the derivative

$$\frac{d(1/x)}{dx} = -\frac{1}{x^2}.$$

Is the sign of the integral correct here? Can the integral of a positive function, $1/x^2$, be negative? Does not this sign contradict the assertions made above?

Any dismay is due to the fact that 
formula (3.5.6) is not written in proper 
fashion. If we write it as

$$\int \frac{dx}{x^2} = -\frac{1}{x} + C,$$

then we cannot say that the sign of the integral is always negative, since this also depends on the sign and value of the quantity C.

Actually, all statements concerning 
the sign of the integral have referred 
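to the definite integral. Let us take

$$\int_a^b \frac{dx}{x^2} = -\frac{1}{x}\,\Big|_a^b = -\frac{1}{b} + \frac{1}{a} = \frac{1}{a} - \frac{1}{b} = \frac{b - a}{ab}.$$

When b > a (and both a and b are positive), the integral is positive, as it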
should be, that is, formula (3.5.6) is 
correct and leads to a correct result 
for the definite integral. 
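
A numerical cross-check of this resolution, offered as a sketch with assumed values a = 0.5 and b = 4 and an assumed midpoint sum (the names are hypothetical):

```python
def midpoint_integral(f, a, b, n=20_000):
    """Midpoint sum approximating the definite integral from a to b."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx


a, b = 0.5, 4.0                                  # both limits positive, b > a
F = lambda x: -1.0 / x                           # the negative-looking antiderivative (3.5.6)

numeric = midpoint_integral(lambda x: 1.0 / x ** 2, a, b)
from_antiderivative = F(b) - F(a)                # -1/b + 1/a
closed_form = (b - a) / (a * b)

print(numeric, from_antiderivative, closed_form)
```

All three numbers agree and are positive, even though F(x) itself is negative for every positive x: only the difference F(b) - F(a) enters the definite integral.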

Looking ahead, we may note that the integral $\int dx/x^2$ involves other, no longer fictitious but real, difficulties, which will be considered in Section 5.2.


3.6 Examples and Applications

In Chapter 2 and in the preceding 
sections of Chapter 3 we examined the 
relationship between distance and ve- 
locity and the relationship between 
the equation of a curve and the area 
under the curve. These relationships 
are the concrete problems on the basis 
of which differential and integral cal- 
culus took shape in the history of ma- 
thematics. But the derivative and the 
integral are, of course, applicable not 
only to the foregoing problems but to 
an extraordinarily broad range of phe- 
nomena in the most diverse fields of 
science, technology, and everyday life. 
Actually, the derivative, the integral, 
and the Newton-Leibniz theorem (which 
establishes the relationship between 
these two concepts) constitute the most 
suitable language describing the laws of 
nature. 

Anyone beginning the study of a 
foreign language has to repeat all 
manner of simple phrases just to get 
used to the new medium: “there is a 
table in the room,” “a glass is on the 
table,” “there is a cat on the floor,” 
“there is a mouse near the cat.” The 
same goes for the student of higher math- 
ematics. He, or she, has to repeat the 
relationships between the derivative 
and the integral in a multitude of sim- 
ilar examples. One must first learn 
a foreign language before attempting 
to express ideas in it. And that is our 
task now: we have to learn to express 
familiar relationships and to formulate 
problems in the language of higher 
mathematics before trying to solve 
problems and obtain new results.^{3.15}


3.15 Goethe put it this way, “Mathematicians are a species of Frenchmen; if something is said to them, they translate it into their own language and presto! it is something entirely different.” A well-known statement in a similar vein was made by the distinguished American physicist and mathematician Josiah Willard Gibbs (1839-1903), father of modern thermodynamics. During a discussion of what should be given preference in curricula, languages (especially Latin and Greek) or mathematics, Gibbs said that mathematics is a language too. That was his only public statement (all the more remarkable!) on the general aspects of teaching. Almost the same thing was stated much earlier by the famous Galileo Galilei (1564-1642), who declared that “the laws of Nature are recorded in the greatest of all books, which is always open to our gaze (I refer to the Universe), but you cannot understand it unless you first learn to understand the language in which it is written, and it is written in the language of mathematics.” (And, rather surprisingly, he goes on to say that “its letters [i.e. the letters of mathematics, Ya.Z. and I.Ya.] are triangles, circles, and other geometric figures without which its words cannot be understood.” Thus Galileo regarded geometry and not differential and integral calculus as the language of Nature. But we must bear in mind that in Galileo’s time that is all there was to mathematics: the geometry of the ancient Greeks. Naturally, Galileo could not foresee the creation of higher mathematics by Leibniz and Newton, who came later.)

Here are a few typical examples that illustrate the content of the present chapter. The majority of these will be developed further in subsequent chapters.

1. Acceleration and uniformly accelerated motion; the work performed by a force. Earlier, in Chapters 2 and 3, we constantly considered the displacement of a body and the velocity of motion (as the derivative of the body’s coordinate with respect to time). But after the instantaneous velocity v(t) = dz(t)/dt has been found and we know the relationship between the instantaneous velocity and time, we can ask how the velocity varies with time. This question was studied briefly in Section 2.7.

The derivative of velocity with respect to time is called the acceleration and is denoted by a:

$$\frac{dv}{dt} = a(t). \tag{3.6.1}$$

Since the dimensions of velocity are cm/s or m/s, the dimensions of acceleration are cm/s² or m/s².

We write the velocity in the form of the derivative of the coordinate with respect to time, v = dz/dt, and substitute this into Eq. (3.6.1). The result is

$$a(t) = \frac{d}{dt}\left(\frac{dz}{dt}\right) = \frac{d^2 z}{dt^2}. \tag{3.6.2}$$


Thus, acceleration is the second deriva- 
tive of the coordinate with respect to time . 

Note the exponents (the “twos”) in the expression $d^2z/dt^2$: the dimensions of acceleration are those of $z/t^2$ (the symbol d is not a quantity).

If we know the acceleration as a 
function of time, the instantaneous 
velocity v = v (t) may be written in 
the form of an integral of a(t) = dv/dt:

$$v(t) = v_0 + \int_{t_0}^{t} a(t)\,dt. \tag{3.6.3}$$

(Here $v_0 = v(t_0)$ is the instantaneous velocity at the initial moment of time $t = t_0$; to verify this we need only substitute $t_0$ for t on both sides of Eq. (3.6.3).) In particular, if the acceleration is constant (a = constant, the case of uniformly accelerated motion), from (3.6.3) it follows that

$$v(t) = v_0 + a\int_{t_0}^{t} dt = v_0 + a\,(t - t_0), \tag{3.6.4}$$

or 

$$v(t) = at + b, \tag{3.6.4a}$$

where $b = v_0 - at_0$, that is, the velocity of uniformly accelerated motion
is a linear function of time. 

Knowing the velocity v = v (t) of a 
motion, we can establish how the coor- 
dinate varies with time, that is, find 
the law of motion z = z (t): 

$$z(t) = z_0 + \int_{t_0}^{t} v(t)\,dt, \tag{3.6.5}$$

or, in view of (3.6.3), 

$$z(t) = z_0 + \int_{t_0}^{t} \left[v_0 + \int_{t_0}^{t} a(t)\,dt\right] dt = z_0 + v_0\,(t - t_0) + \int_{t_0}^{t} \left[\int_{t_0}^{t} a(t)\,dt\right] dt, \tag{3.6.5a}$$

where $z_0 = z(t_0)$ is the initial position (coordinate) at time $t_0$ (substitute $t_0$ for t on both sides of (3.6.5)). Of course, in general form, formula (3.6.5a) looks unwieldy. But if a = constant, then the velocity v = v(t) is given by a simple formula (3.6.4a) or (3.6.4), and Eq. (3.6.5a) enables us to describe the motion of a body in closed form:

$$z(t) = z_0 + \int_{t_0}^{t} (at + b)\,dt = z_0 + a\,\frac{t^2 - t_0^2}{2} + b\,(t - t_0) = \frac{a}{2}\,t^2 + bt + c, \tag{3.6.6}$$

where $c = -(a/2)\,t_0^2 - bt_0 + z_0$
(check to see whether the derivative 
of the right-hand side of (3.6.6) with 
respect to t coincides with the right- 
hand side of (3.6.4a)). Thus, the de- 
pendence of the coordinate z of a body 
in uniformly accelerated motion on time 
is quadratic. For example, in the case of 
motion under the action of gravity 
(free fall) we have a = -g, where $g \approx 9.8$ m/s² (the minus sign being due to the fact that the upward direction is taken to be positive). Assuming a = -g in (3.6.4), (3.6.4a), and (3.6.6), we get the well-known formulas

$$v(t) = v_0 - \int_{t_0}^{t} g\,dt = v_0 - g\,(t - t_0) = v_0 - gt_1,$$

where $t_1 = t - t_0$ is the new time variable reckoned from $t = t_0$, and

$$z(t) = z_0 + \int_{t_0}^{t} [v_0 - g\,(t - t_0)]\,dt = z_0 + \int_0^{t_1} (v_0 - gt_1)\,dt_1 = z_0 + v_0 t_1 - \frac{g}{2}\,t_1^2.$$
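
The passage from acceleration to velocity and then to coordinate can be imitated by accumulating the small terms a dt and v dt step by step, and the result compared with the closed-form free-fall formulas. Everything specific in this sketch is an assumption chosen for illustration: the function name `integrate_motion`, the initial data, the time interval, and the number of steps.

```python
def integrate_motion(a, z0, v0, t0, t_end, n=100_000):
    """Build v(t) and z(t) as running sums, in the spirit of (3.6.3) and (3.6.5)."""
    dt = (t_end - t0) / n
    t, v, z = t0, v0, z0
    for _ in range(n):
        z += v * dt        # coordinate accumulates the terms v * dt
        v += a(t) * dt     # velocity accumulates the terms a * dt
        t += dt
    return v, z


g = 9.8                                   # m/s^2, upward direction positive
z0, v0, t0, t_end = 10.0, 5.0, 1.0, 3.0   # hypothetical initial data

v_num, z_num = integrate_motion(lambda t: -g, z0, v0, t0, t_end)

t1 = t_end - t0                           # time reckoned from t0
v_exact = v0 - g * t1
z_exact = z0 + v0 * t1 - g * t1 ** 2 / 2

print("velocity:  ", v_num, v_exact)
print("coordinate:", z_num, z_exact)
```

Since `a` is passed in as a function of time, the same loop also handles a variable acceleration, for which no closed formula such as (3.6.6) need exist.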

In connection with acceleration we recall that the work performed by a force is equal to force F times distance s (here we assume that the displacement coincides with the direction in which the force acts; see Figure 3.6.1). However, this fact can be employed only when the force is constant, a situation that is almost always unrealistic. If the force F varies in the course of motion, then the whole path, or distance, from $z = z_0$ to another (variable) position of the body, or coordinate, z has to be partitioned into small intervals of length $\Delta z$. The products $F\,\Delta z$, obtained on the assumption that the force in each interval remains constant (since the $\Delta z$ are small, F does not have time to vary appreciably), must then be added. In this way we arrive at the “integral sum” $\sum F\,\Delta z$, in which in order to obtain an expression for the work A we must pass to the limit as all the $\Delta z$ tend to zero. The last circumstance makes it possible to assume that v = dz/dt is constant on each subinterval, with the result that the length $\Delta z$ can be expressed by the product of v by the time $\Delta t$ which it takes to traverse $\Delta z$. Hence, the sum $\sum F\,\Delta z$ is replaced with the sum $\sum Fv\,\Delta t$, which enables representing the work A either in the form of an integral with coordinate z as the variable of integration or in the form of an integral with time t as the variable of integration:

$$A = \int_{z_0}^{z} F\,dz = \int_{t_0}^{t} Fv\,dt. \tag{3.6.7}$$

Figure 3.6.1

Now we recall that the force F is 
equal to the mass of the body, m, times 
the acceleration a = dv/dt (Newton’s second law). This enables us to rewrite (3.6.7) thus:

$$A = m\int_{t_0}^{t} \frac{dv}{dt}\,v\,dt. \tag{3.6.8}$$

Since, obviously, the function v (dv/dt) is the first derivative of $v^2/2$ (see Section 2.3), we finally obtain

$$A = m\left(\frac{v^2}{2} - \frac{v_0^2}{2}\right), \tag{3.6.9}$$

where v = v(z) and $v_0 = v(z_0)$: the work A done by a force on a body is equal to the increment $mv^2/2 - mv_0^2/2$ of the kinetic energy $mv^2/2$ of the body (of course, both the work A and the increment of the kinetic energy may prove to be negative simultaneously).
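
The equality of work and kinetic-energy increment can be tested on a made-up motion. In the sketch below the mass, the velocity law v(t) = 3 + sin t, the time intervals, and the name `work_integral` are all assumptions introduced for the illustration; the integral (3.6.8) is approximated by a midpoint sum.

```python
import math


def work_integral(m, v, dvdt, t0, t_end, n=20_000):
    """Midpoint sum for A = m * integral of (dv/dt) * v dt, cf. (3.6.8)."""
    dt = (t_end - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * dt
        total += m * dvdt(t) * v(t) * dt
    return total


m = 2.0                               # hypothetical mass
v = lambda t: 3.0 + math.sin(t)       # hypothetical velocity law
dvdt = lambda t: math.cos(t)          # its derivative, the acceleration


def kinetic_increment(t0, t_end):
    """Increment m v^2 / 2 - m v_0^2 / 2 between the two moments of time."""
    return m * v(t_end) ** 2 / 2 - m * v(t0) ** 2 / 2


for t0, t_end in ((0.0, 2.5), (2.0, 4.0)):
    print(work_integral(m, v, dvdt, t0, t_end), kinetic_increment(t0, t_end))
```

On the second interval the body slows down, so both the work and the increment of the kinetic energy come out negative, as the text remarks.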

For a further detailed investigation 
of the questions touched on here, see 
Chapters 9 and 10. 

2. Stretching of a wire. Consider a 
rod one meter long, 0.4 cm in diameter, 
and made of copper. This piece of wire 
is hung by one of its ends. The ques- 
tion is: how much will the rod stretch 
under its own weight? The stretching of 
the wire is governed by Hooke's law^{3.16}
which is valid for not too large forces 
(and elongations) and states that the 
deformation (and stretching) l is di- 
rectly proportional to the applied force 
F and the length of the wire , L , and is 
inversely proportional to the cross-sec- 
tional area S: 


l 


1 FL 
E S ’ 


(3.6.10) 


where £ is a constant factor depending 
on the material of the wire and known 
as Young's modulus . 

Naturally, we cannot replace L with 
the length of the rod, 1 m (100 cm), 
and F with the weight of the rod, which 
can easily be found from the volume of 
the rod and the density of copper, 
since the forces acting on different por- 
tions of the rod differ from point to 
point: while for point A at which the 
rod is suspended this force is indeed 
equal to the weight of the entire rod, 
for the other (free) end B this force is 
simply zero, since this point is under 
no load (Figure 3.6.2). For this reason 
we must partition the entire wire 
(with coordinate z being reckoned from 


3.16 Robert Hooke (1635-1703), an outstanding English scientist, chemist, and mathematician and a contemporary (and in many cases a scientific opponent) of Sir Isaac Newton.



the free end B) into small sections of length $\Delta z$ each; for each such section at a distance z from B, the force may be assumed constant and equal to the weight of the part of the rod from point B upward (the length of this part is z): $F = \rho g S z$, where $\rho$ is the density of copper, g the acceleration of gravity, and $Sz = V$ the volume of the part of the rod under consideration. Thus, to this small section of the wire of length $\Delta z$ there corresponds an elongation

$$\Delta l = \frac{1}{E}\,\frac{F\,\Delta z}{S} = \frac{\rho g}{E}\,z\,\Delta z, \tag{3.6.11}$$

and the total elongation of the entire wire is obtained by summation (or integration) of all such “elementary” elongations:

$$l = \int_0^{100} \frac{\rho g}{E}\,z\,dz = \frac{\rho g}{E}\,\frac{(100)^2}{2}. \tag{3.6.11a}$$


Substituting the numerical values $g = 980$ cm/s$^2$, $\rho = 8.9$ g/cm$^3$, $E = 9.8 \times 10^{13}$ g/(cm·s$^2$), we obtain

$$l = \frac{980 \times 8.9 \times 10^4}{2 \times 9.8 \times 10^{13}} \approx 4.5 \times 10^{-7}\ \text{cm}$$

(note that the factor $100^2 = 10^4$ in the numerator has the dimensions of cm$^2$).
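
The summation over small sections that leads to (3.6.11a) can be imitated numerically. This is a minimal sketch: the function names are hypothetical, and the numerical values are simply those quoted in the text (CGS units).

```python
def elongation_by_slices(rho, g, young, length, n=1000):
    """Sum the elementary elongations (rho*g/E) * z * dz over n sections."""
    dz = length / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * dz            # midpoint of the section, measured from the free end B
        total += rho * g / young * z * dz
    return total

def elongation_closed_form(rho, g, young, length):
    """The integrated result: (rho*g/E) * length**2 / 2."""
    return rho * g * length ** 2 / (2 * young)

# Values as quoted in the text: density, gravity, Young's modulus, rod length (cm).
RHO, G, YOUNG, LENGTH = 8.9, 980.0, 9.8e13, 100.0

sliced = elongation_by_slices(RHO, G, YOUNG, LENGTH)
exact = elongation_closed_form(RHO, G, YOUNG, LENGTH)
print(sliced, exact)
```

Because the force grows linearly with z, the midpoint sum reproduces the integral essentially exactly, even for a modest number of sections.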

Figure 3.6.3

3. Water flow from a vessel. Picture a vessel of arbitrary shape (Figure 3.6.3) with water flowing out. The mass of liquid in the vessel at a given instant of time is equal to M, which is a function of time: M = M(t). The liquid collects in another vessel; the mass of liquid here is m(t). We denote the mass of liquid flowing out of the first vessel in unit time by W = W(t); this quantity is known as the flow rate and has the dimensions of g/s. The quantities m, M, and W are connected by the following relations:

$$\frac{dM}{dt} = -W(t), \qquad \frac{dm}{dt} = W(t). \tag{3.6.12}$$

These differential relations (that is, relations involving derivatives) can be written in the form of integrals. Suppose that at a certain initial moment $t_0$ the mass of the liquid in the first vessel was $M(t_0) = M_0$, while the second vessel was empty, or $m(t_0) = 0$. Then

$$m(t_1) = \int_{t_0}^{t_1} W(t)\,dt, \qquad M(t_1) = M(t_0) - \int_{t_0}^{t_1} W(t)\,dt. \tag{3.6.13}$$

Thus, the mass of liquid at a definite time $t_1$ is expressed in terms of an integral in which the variable of integration t runs through all values from $t_0$ to $t_1$.

It will be convenient in (3.6.13) to replace $t_1$ with t and rename the variable of integration (using the fact that it is a dummy variable) and call it, say, $\tau$ (tau, the Greek letter corresponding to the Latin t). We then have

$$m(t) = \int_{t_0}^{t} W(\tau)\,d\tau, \qquad M(t) = M(t_0) - \int_{t_0}^{t} W(\tau)\,d\tau. \tag{3.6.13a}$$

Ordinarily, this is simply written as

$$m(t) = \int_{t_0}^{t} W(t)\,dt, \qquad M(t) = M(t_0) - \int_{t_0}^{t} W(t)\,dt, \tag{3.6.13b}$$

but remember that the t under the integral sign has a different meaning from the independent variable t in M(t) and m(t), which coincides with the upper limit of integration. In this respect, the notation (3.6.13) and (3.6.13a) is more exact than (3.6.13b).
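
The distinction between the dummy variable and the upper limit is easy to see in code, where the two are necessarily different names. The sketch below is only an illustration: the exponentially decaying flow rate and all numerical values are assumptions chosen so that the integral has a simple closed form for comparison.

```python
import math

# Assumed illustrative data (not from the text).
W0, T_SCALE = 5.0, 2.0      # initial flow rate (g/s) and decay time (s)
M0, T0 = 20.0, 0.0          # initial mass in the first vessel (g), initial moment (s)

def flow_rate(tau):
    """Assumed flow rate W(tau)."""
    return W0 * math.exp(-tau / T_SCALE)

def collected_mass(t, n=100_000):
    """m(t): the integral of W(tau) d(tau) from T0 to t, by the midpoint rule.

    The loop variable plays the role of the dummy variable tau;
    the argument t appears only as the upper limit.
    """
    d_tau = (t - T0) / n
    return sum(flow_rate(T0 + (i + 0.5) * d_tau) for i in range(n)) * d_tau

def remaining_mass(t):
    """M(t) = M(t0) - integral of W, as in (3.6.13a)."""
    return M0 - collected_mass(t)

t = 3.0
exact = W0 * T_SCALE * (1 - math.exp(-t / T_SCALE))   # closed form for the assumed W
print(collected_mass(t), exact, remaining_mass(t))
```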

The formulas given above correspond to an experiment in which M and the flow rate W are measured at distinct moments of time. Often, however, the problem is stated differently: the flow rate W depends in some known fashion on the pressure, that is, the height of a column of liquid, h. In turn, given a definite shape of vessel, the height h is a function of M. Thus, we know the flow rate as a function of the quantity of liquid in the vessel, W = W(M). Then (3.6.12) takes the form

$$\frac{dM}{dt} = -W(M). \tag{3.6.14}$$

Equations of such a type, which connect the unknown function (in our case the function M = M(t)) with its derivative, are called differential equations; thus, Eq. (3.6.14) is a differential equation for the unknown function M(t), that is, a function that will be found when we solve the equation. The topic of differential equations, as well as the solution to this specific problem, will be considered in Chapter 6.

Figure 3.6.4

4. Capacitor. We denote the charge accumulated on a capacitor C (Figure 3.6.4) (the quantity of electricity) by q. In the SI system of units (see Appendix 5), q is measured in coulombs (abbreviated C). The electric current j flowing in the wire is the quantity of electricity flowing in unit time and is measured in amperes (abbreviated A). One ampere is a current of one coulomb per second: 1 A = 1 C/s; thus, the quantity of electricity has the dimensions of A·s, since in the SI system of units the ampere is a base unit.

The charge on a capacitor$^{3.17}$ and the current are connected by the equation

$$\frac{dq}{dt} = j \tag{3.6.15}$$

(the positive direction of the current is indicated by the arrow in Figure 3.6.4). If the variation in current flow with time, j = j(t), is given or has been found via experiment, then we can write the integral relation

$$q(t_1) = q(t_0) + \int_{t_0}^{t_1} j(t)\,dt. \tag{3.6.15a}$$

If the capacitance C of the capacitor is given (note that the capacitance and the capacitor are designated in Figure 3.6.4 by the same letter), then the voltage drop across the capacitor may be expressed in terms of q in the following manner: $\varphi_C = q/C$, and the voltage drop across the resistance R is $\varphi_R = E_0 - \varphi_C = E_0 - q/C$, where $E_0$ is the battery voltage. By Ohm’s law, the current flowing through resistance R is $j = R^{-1}(E_0 - q/C)$. Using (3.6.15), we arrive at the differential equation

$$\frac{dq}{dt} = \frac{1}{R}\left(E_0 - \frac{q}{C}\right). \tag{3.6.16}$$

3.17 We use the term charge on the capacitor to mean the quantity of positive electricity on the left-hand plate of capacitor C in Figure 3.6.4 expressed in coulombs.



Figure 3.6.5 

Problems involving capacitors are dis- 
cussed in detail in Part 2 of this book. 
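
Although the solution of (3.6.16) is not derived in this section, the equation can already be explored numerically by stepping the charge forward in small time increments. The sketch below uses the simplest such scheme (Euler's method); the component values are assumptions, and the closed-form expression used for comparison is the standard solution of this linear equation for an initially uncharged capacitor, quoted here without derivation.

```python
import math

# Assumed illustrative circuit values: ohms, farads, volts.
R, C, E0 = 1.0e3, 1.0e-6, 5.0

def charge_by_euler(t_end, q0=0.0, n=100_000):
    """Advance dq/dt = (E0 - q/C) / R from t = 0 to t_end in n small steps."""
    dt = t_end / n
    q = q0
    for _ in range(n):
        q += dt * (E0 - q / C) / R
    return q

t_end = 5 * R * C                      # five "time constants" R*C
q_numeric = charge_by_euler(t_end)

# Standard closed-form solution for q(0) = 0, used only as a check.
q_exact = C * E0 * (1 - math.exp(-t_end / (R * C)))
print(q_numeric, q_exact)
```

The stepping scheme uses nothing beyond the statement of the problem in the form (3.6.16), which is the point of writing the equation down in the first place.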

5. Atmospheric pressure. Imagine a vertical column of air with constant cross-sectional area S cm$^2$. The density of the air is $\rho$ g/cm$^3$ and depends on the altitude h above the earth’s surface. The volume of a thin layer contained between h and $h + \Delta h$ is equal to $S\,\Delta h$ (Figure 3.6.5). Inside this thin layer the density $\rho(h)$ may be regarded as constant, which is precisely why we took a thin layer. In the given instance, $\Delta h$ may be pictured as 1 meter or 10 meters or even (with a slightly lower accuracy) as 100 meters, since air density varies roughly by 12 to 14% per kilometer of altitude.

Since the volume of the layer of air $\Delta h$ thick is $S\,\Delta h$, the mass of air contained in this layer, by the definition of density, is $\Delta m = \rho S\,\Delta h$. To find the mass of air in a column extending from $h_1$ to $h_2$, we must construct the sum of expressions of the type $\rho S\,\Delta h$, with S constant and $\rho$ varying with altitude h. The sum is extended over all layers $\Delta h$ into which the column of air is partitioned. Decreasing all $\Delta h$ without limit (that is, sending them to zero), we arrive at the integral

$$m(h_1, h_2) = S\int_{h_1}^{h_2} \rho(h)\,dh. \tag{3.6.17}$$

The mass of air in a column from the earth’s surface (h = 0) to an altitude h is

$$m(0, h) = S\int_0^h \rho(h)\,dh. \tag{3.6.17a}$$

The mass of air above a given altitude h is

$$m = S\int_h^{\infty} \rho(h)\,dh, \tag{3.6.17b}$$

where the symbol $\infty$ in the upper limit takes the place of a very large number H such that any subsequent increase in this quantity does not substantially change the integral.

The pressure P at some altitude h multiplied by the area S is equal to the force with which the entire column of air above h is attracted to the earth. The force of gravity is equal to the mass multiplied by the acceleration of gravity g,$^{3.18}$ whence

$$P(h) = \int_h^{\infty} g\rho(h)\,dh. \tag{3.6.18}$$

Using formula (3.5.5), we get the differential equation

$$\frac{dP}{dh} = -g\rho(h). \tag{3.6.19}$$

This formula could have been written straight off by considering the equilibrium of a thin layer dh acted upon from below by the pressure P(h) and from above by the pressure P(h + dh); the resultant of these two forces balances the attraction to the earth of the mass of air in the layer $\Delta h$.

We will continue on this subject in 
Chapter 11. 
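
The passage from the integral (3.6.18) to the differential equation (3.6.19) can be checked numerically. The sketch below is built on a loud assumption: an exponentially decreasing density with a sea-level value of 1.2 × 10⁻³ g/cm³ and a scale height of 8 km, chosen only because it gives a decrease of roughly 12% per kilometer; the text itself does not specify ρ(h). The infinite upper limit is replaced by a large finite altitude, as described above.

```python
import math

# Assumed model of the atmosphere (CGS units); not taken from the text.
G = 980.0            # acceleration of gravity, cm/s^2
RHO0 = 1.2e-3        # assumed density at h = 0, g/cm^3
SCALE = 8.0e5        # assumed scale height, cm (8 km)

def density(h):
    return RHO0 * math.exp(-h / SCALE)

def pressure(h, top=30 * SCALE, n=20_000):
    """P(h): integral of g*rho from h up to a large altitude 'top' standing in for infinity."""
    dh = (top - h) / n
    return sum(G * density(h + (i + 0.5) * dh) for i in range(n)) * dh

h = 2.0e5            # 2 km
step = 1.0e3         # 10 m

# Central-difference estimate of dP/dh, to be compared with -g*rho(h).
slope = (pressure(h + step) - pressure(h - step)) / (2 * step)
print(slope, -G * density(h))
```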

3.18 In the majority of cases we can assume that the acceleration of gravity g does not alter over the altitudes h considered in this problem and that the earth has a flat surface, although both assumptions do not in any way change our arguments.

Figure 3.6.6

6. Volume. Let us employ integral calculus to compute the volume of bodies (solids). Let us slice a solid of total volume V (Figure 3.6.6) by planes z = constant into thin layers. In this problem we will employ a system of coordinates in space specified by the x axis and the y axis in the horizontal plane xOy and by the vertical z axis (the third coordinate axis in space). The volume $\Delta V$ of a thin layer bounded by the planes corresponding to the values z and $z + \Delta z$ on the z axis and parallel to the xOy plane is approximately equal to the product of the cross-sectional area S(z) by the thickness of the layer, $\Delta z$. Thus, if we know the area of a cross section of the solid cut by a horizontal plane, S = S(z), then the volume of the solid can be computed by the formula

$$V = \int_{z_0}^{z_1} S(z)\,dz, \tag{3.6.20}$$

where $z_0$ and $z_1$ are the smallest and greatest values of the z coordinate in our solid.

Figure 3.6.7

Let us apply this formula to a regular quadrangular pyramid with its vertex at the origin of coordinates and with its axis of symmetry directed along the z axis (Figure 3.6.7). Let the altitude of the pyramid be h and the base (at the top of the figure) a square with side a. From geometry we recall that a cross section of a pyramid cut by a horizontal plane at an altitude z is a square, the side b of which is to a as z is to h, that is, b = b(z) = a(z/h). Hence, the cross-sectional area is $S(z) = b^2 = (a^2/h^2)\,z^2$. The volume of the pyramid is

$$V = \int_0^h \frac{a^2}{h^2}\,z^2\,dz = \frac{a^2}{h^2}\int_0^h z^2\,dz.$$

Let us take advantage of the result of Section 3.4:

$$\int_0^h z^2\,dz = \left.\frac{z^3}{3}\right|_0^h = \frac{h^3}{3}.$$

We then get an expression for the volume of a pyramid:

$$V = \frac{a^2}{h^2}\,\frac{h^3}{3} = \frac{1}{3}\,a^2 h.$$

We found that the volume of a pyramid is equal to one third the product of the area of the base by the altitude of the pyramid. To derive this formula in solid geometry without the aid of integrals is a rather complicated job.

We will return to the question of the use of integral calculus for the computation of volumes of solids in Chapter 7.
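
Formula (3.6.20) translates directly into a sum over thin layers. The following sketch applies it to the pyramid just considered; the function names and the particular dimensions (side 3, altitude 5) are hypothetical choices made for illustration.

```python
def volume_by_slices(area, z0, z1, n=100_000):
    """Approximate the integral of S(z) dz from z0 to z1 by summing n thin layers."""
    dz = (z1 - z0) / n
    return sum(area(z0 + (i + 0.5) * dz) for i in range(n)) * dz

# Assumed dimensions of the pyramid: side of the base and altitude.
A_SIDE, HEIGHT = 3.0, 5.0

def pyramid_section(z):
    """Cross-sectional area S(z) = (a*z/h)**2 of the pyramid at altitude z."""
    side = A_SIDE * z / HEIGHT
    return side ** 2

pyramid_volume = volume_by_slices(pyramid_section, 0.0, HEIGHT)
print(pyramid_volume, A_SIDE ** 2 * HEIGHT / 3)
```

Any other solid can be handled by passing a different cross-sectional area function to the same routine.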


Figure 3.6.8

Here is another example, and it is much more striking than the preceding examples. We wish to find the volume of a cylindrical “hoof”, that is, the part cut off from a (right circular) cylinder by a plane that intersects the cylinder along the diameter AB of the cylinder’s base and forms an angle of 45° with the base (Figure 3.6.8). For the sake of simplicity we take the radius of the cylinder equal to unity (cf. Exercise 3.6.3). The cross section of the “hoof” cut by a plane perpendicular to diameter AB and distant OM = x from the center O of the base is a right isosceles triangle (the right triangle MPQ with acute angle $\angle QMP = 45^\circ$) in which the length of side MP is $\sqrt{OP^2 - OM^2} = \sqrt{1 - x^2}$. But then $PQ = PM = \sqrt{1 - x^2}$ and $S_{\triangle MPQ} = S(x) = (1/2)\,MP \times PQ = (1/2)(1 - x^2)$. Whence, in view of formula (3.6.20), where now we must write x instead of z, the volume of the “hoof” is

$$V = \int_{-1}^{1} \frac{1}{2}(1 - x^2)\,dx = \frac{1}{2}\int_{-1}^{1} (1 - x^2)\,dx = \frac{1}{2}\left[\int_{-1}^{1} dx - \int_{-1}^{1} x^2\,dx\right] = \frac{1}{2}\left[(1 - (-1)) - \left(\frac{1}{3} - \left(-\frac{1}{3}\right)\right)\right] = \frac{1}{2}\left(2 - \frac{2}{3}\right) = \frac{2}{3}$$

(here we employed the obvious properties of the definite integral: $\int_a^b cf\,dx = c\int_a^b f\,dx$, where c is a number, and $\int_a^b (f + g)\,dx = \int_a^b f\,dx + \int_a^b g\,dx$).

Note that the volume of this “round” solid is a rational fraction and in no way is connected with the number $\pi$.
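
That the answer is exactly the rational number 2/3 can be confirmed with exact arithmetic. The sketch below evaluates an antiderivative of S(x) = (1 − x²)/2 at the two limits using Python's standard `fractions` module; the function name is a hypothetical choice.

```python
from fractions import Fraction

def antiderivative(x):
    """F(x) = (x - x**3 / 3) / 2, an antiderivative of S(x) = (1 - x**2) / 2."""
    x = Fraction(x)
    return (x - x ** 3 / 3) / 2

# Volume of the "hoof": the difference of the antiderivative at the limits 1 and -1.
hoof_volume = antiderivative(1) - antiderivative(-1)
print(hoof_volume)
```

No rounding occurs anywhere in this computation, so the equality with 2/3 is exact rather than approximate.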


Exercises 

3.6.1. Derive the formula for the volume of an arbitrary pyramid using the properties of parallel sections.

3.6.2. Find the volume of a (right circular) cone of altitude h and with a base that is a circle of radius R. Will the result change if the (circular) cone is oblique, that is, if its vertex does not lie exactly over the center of the base?

3.6.3. Find the volume of a “hoof” cut off from a cylinder of radius R by a plane passing through the diameter of the cylinder’s base and forming a known angle $\alpha$ with the base.

3.6.4. Use the formula for the volume of a cylindrical “hoof” and evaluate the integral $\int_0^1 y\sqrt{1 - y^2}\,dy$.

* * * 

In Chapters 2 and 3 we considered the 
concepts of the derivative and the in- 
tegral, some of their simpler proper- 
ties, and the relationship between the 
integral and the derivative; in other 
words, we introduced the framework in 
which the concepts and theorems of 
the so-called higher mathematics must 
be developed. The techniques involved 
in practical computation of derivatives 
and integrals of various functions will 
be discussed in Chapters 4 and 5. Only 
the simplest examples were illustrated 
in Chapters 2 and 3. 

A word of warning to the reader: 
do not get into the habit of measuring 
the significance (and even the difficulty) 
of any section of mathematics by the 
number of formulas, their complexity, 
and unwieldiness. Actually, the most 
important thing and the most difficult 
thing is the mathematical statement 
(or formulation) of a problem in the 
form of an algebraic equation or an 
integral or even a differential equa- 
tion, and not the transformations that 
follow. That is where our attention 
should be focused. 

Almost every physicist knows that 
those pieces of research that he did not 
succeed in completing (and which other 
workers did complete) were left undone 
because he confined himself to a gen- 
eral reflection and did not find the 
courage to write down the equations 
and formulate the problem mathemat- 
ically. The computational difficulties 
in a properly posed problem with a 
clear-cut physical content are always 


surmountable, at least via approximate 
procedures if not by precise methods. 

If the last three sections of this chap- 
ter have appeared to be difficult, the 
reader will do well to reread Chapters 
2 and 3. 

The rise of “higher mathematics”, that is, differential and integral calculus, was a turning point in the history of civilization: it gave man a powerful tool for analyzing processes of many different kinds, for developing a fundamental explanation of physical phenomena, and for constructing a scientific picture of the world. Actually, it was the thinkers of ancient Greece, primarily the genius of Archimedes of Syracuse (287-212 B.C.), who skillfully solved the first problems of differential and integral calculus. Archimedes successfully applied mathematics in designing machines and devices (the weapons of war which he invented struck terror into the Romans laying siege to his native Syracuse). Archimedes knew, for example, how to draw tangents to curves and how to calculate an area bounded by curves. He solved the problem of drawing a tangent to what came to be called the Archimedes spiral, that is, the line described by a snail crawling steadily along the spoke of a wheel turning at a constant rate, and also the problem of squaring the parabola, that is, finding the area of a segment of a parabola. Today problems of this kind can be solved without difficulty by any college student (or even a high school student), but in those remote times they were within the powers of only a giant like Archimedes. There was no general method for solving such problems. Each one required great effort.

For that matter, the primitive level of tech- 
nology in ancient Greece did not require the 
solution of problems that called for great in- 
ventiveness but were devoid of applications 
outside mathematics: as a rule, the thinkers of 
ancient times regarded mathematics not as an 
effective method of solving practical problems 
but only as a theoretical science whose perfec- 
tion reflected the profound harmony of the 
world, yet only explained the world in a pu- 
rely philosophical sense. (Archimedes was 
practically the only exception in this respect 
because he closely linked mathematics with 
mechanics and physics. But he was a genius; in 
this, as in many other respects, he was far 
ahead of his time.) 

The static nature of life in ancient Greece, where hardly any machines were known and life centered around city squares, palaces, and temples decorated with beautiful statues as immobile as the temples and city squares themselves, gave rise to metaphysical thinking (rigidity) that was so characteristic of the mathematics of antiquity: it was not customary to regard processes in flux. The ancients limited themselves to unchanging “states,” a reflection of which were, for the famous geometer Euclid of Alexandria (early third century B.C.), merely chains of congruent triangles, which occurred frequently in his arguments.
With the flowering of Italian cities in the 
15th and 16th centuries and the appearance of 
the first factories, which heralded the imminent 
rise of machine-based production, the static 
metaphysical thinking of the ancients became 
inconceivable. In this period it could only 
hinder much needed scientific progress. The 
great Galileo Galilei (1564-1642) was the first 
to proclaim publicly that mathematics provid- 
ed the key to the mysteries of the Universe 
(see footnote 3.15). Under Galileo’s influence, 
his pupils, Evangelista Torricelli (1608-1647), who discovered the principle of the barometer, and the geometrician Bonaventura Cavalieri (1598-1647), who continued the work of Archimedes, for whom their teacher had such great affection, solved many specific problems which today lie within the province of higher mathematics. For one, Cavalieri evolved a method of calculating volumes that is very similar to those described above. Galileo was the first to see that the problem of determining the path of an object from its velocity practically coincides with the problem that had so interested Archimedes, that of determining the area of curved figures. Continuing Galileo’s work, Torricelli established that the inverse problem, that of finding the velocity from the path, was akin to the problem of drawing a tangent to a curve. But at that time
there did not yet exist general methods of solv- 
ing such problems; there was no common al- 
gorithm to enable one “to calculate without 
thinking.” 

Nor did Johann Kepler (1571-1630), that outstanding astronomer and mathematician whose discoveries played such a big role in building a scientific picture of the world, have any such method. Kepler was indisputably the leading master of integration in his day (though it was not called that yet). In 1614, when Kepler married a second time, he had occasion to buy a good deal of wine for the wedding reception, and he saw how difficult it was to estimate the cubical contents of wine casks of different shapes from the base radius and the height. The problem intrigued him. The following year, 1615, he published his Nova Stereometria Doliorum Vinariorum (The New Science of Measuring Volumes of Wine Casks), “with addenda to Archimedean stereometry,” as he also informed the reader. Here he collected a large number of problems in determining the volume of bodies limited by curved surfaces and displayed great ingenuity in developing formulas for such volumes, which are today established by integral calculus (see Section 7.10).

Rene Descartes (1596-1650) was truly the forerunner of higher mathematics. Soldier, diplomat, natural scientist, and abstract thinker, he was the father of analytic geometry (the method of coordinates in geometry; see Chapter 1) and a profound philosopher who declared that the world was knowable. He stated the basic principles of dialectics and the place occupied in life by various processes whose study, he believed, constituted the chief aim of mathematics. Descartes substantially improved the symbols and language of mathematics, giving them their present-day appearance. This greatly stimulated further progress and democratization of mathematical knowledge.

Rene Descartes

Another French scientist, Pierre Fermat (1601-1665), a lawyer by profession and a profound amateur mathematician, was a contemporary and, in a way, a rival of Descartes. Working independently of Descartes, Fermat somewhat earlier elaborated (though it was published later) a system of using, in geometry, the methods of coordinates and algebraic computations now known as analytic geometry. Fermat’s exposition of the subject was perhaps closer to the modern one than that of Descartes; and the fact that the writings of Descartes have come to hold a much more significant place in the history of science is connected primarily with his active teaching and his more advanced system of notation. (The situation was much the same in the development of differential and integral calculus.) Simultaneously, Fermat occupied himself with problems that are now part of mathematical analysis. He evolved the concept of the differential (see Section 4.1 below) and the overall idea that the maxima and minima of a (smooth) function must lie at the points where the (first) derivative of the function vanishes, and also carried out ingenious calculations of the values of some integrals (e.g., the method of deducing formula (5.2.1) indicated in Exercise 3.2.3). Descartes, too, made contributions to the field of analysis.

Pierre Fermat

Christian Huygens (1629-1695) of the Netherlands (a junior contemporary of Descartes and Fermat) created the wave theory of light and was noted for his work in the application of mathematics to mechanics and physics (e.g., he produced the strict mathematical theory of pendulum clocks (see Section 7.9)). In his investigations, however, Huygens used the archaic methods of thinkers of antiquity, Archimedes for one, because he believed that a newer method would yield no advantages inasmuch as he could solve any problem “in the old way.” (True, but you had to have the brains of Huygens to do that.)

Two outstanding thinkers of the 17th cen- 
tury, the Englishman Sir Isaac Newton (1643- 
1727) and the German Gottfried Leibniz (1646- 
1716), are rightly regarded as the real found- 
ers of higher mathematics. Both indisputab- 
ly rank among the most profound scientists 
that the world has ever known. To them we 
owe a coherent exposition of the new calculus, 
a chain of formulas for finding the derivative 
of any given algebraic function without any 
difficulty and a complete understanding of the 
connection between the derivative and the in- 
tegral and the significance of that connection, 
which provides a general algorithm for cal- 
culating integrals by turning to a list of deriv- 
atives (see Chapters 4 and 5). 

Christian Huygens

Leibniz was a truly versatile scientist: he delved into philosophy, philology, history, psychology (in which he was one of the pioneers of penetration into the sphere of the subconscious), biology (he was one of the precursors of the theory of evolution), geology, mining, mathematics, and mechanics (he evolved the concept of “living force,” that is, kinetic energy). At the same time he was active in politics and diplomacy (he strove to reconcile the German principalities because he foresaw their eventual union into a single state, and dreamed of uniting the Catholic and Protestant churches). He was also an organizer of scientific academies; for one, he was the founder and first president of the Prussian Academy of Sciences in Berlin, and he had several talks with Peter the Great of Russia about founding a Russian academy of sciences, which was later established in full conformity with his suggestions and wishes. The mathematical analysis of Leibniz [to whom we owe, among other things, the modern terms “derivative” (Ableitung in German) and “integral”$^{3.19}$] was of a form much like that adopted in our book. Leibniz designated the (first) derivative of a function y = f(x) by the symbol dy/dx; he understood the “differentials” dy and dx as the “limiting” values of the increments $\Delta y$ and $\Delta x$ at which we arrive by an unbounded reduction of $\Delta x$ (neither the word “limit” nor its concept existed in the mathematics of Leibniz). This approach corresponded to introducing the concept of the derivative as the tangent of the angle formed by the tangent line to the graph of the function and the x axis: to a small $\Delta x$ there corresponds a small triangle MNP (see Figure 7.9.1), where $\tan\alpha_1 = NP/PM = \Delta y/\Delta x$; when $\Delta x$ is reduced without limit, the increments $\Delta x$ and $\Delta y$ are replaced by the differentials dx and dy, and the secant MN of length ds (the differential of the arc length; see Section 7.9) is replaced by the appropriate tangent line (Leibniz called such a triangle the characteristic triangle).

Leibniz emphasized in every possible way the algorithmic aspect of the new calculus and the system of rules that automatically guarantee a correct result when seeking derivatives. He also worked out a method of handling differentials (dealt with here in Chapter 4); use of this method provides a recipe for calculating derivatives. Leibniz used the modern symbol $\int f(x)\,dx$ to designate the integral of the function y = f(x).

3.19 At first Leibniz simply spoke of the sum (and also “summational calculus” instead of “integral calculus”); he enthusiastically adopted the terms integral (from the Latin integer, meaning whole) and integral calculus proposed by his pupils, the brothers Jakob and Johann Bernoulli.

Isaac Newton

Leibniz was a tireless teacher, and that 
plus his felicitous system of notation and terms 
resulted in the universal acceptance of higher 
mathematics in the form he developed it. 
Leibniz is also considered a classic in the 
field of philosophy (here he did much to 
extend and to amplify the work done by 
Descartes). He also deserves credit for many 
profound ideas that partially became reality 
only in the 19th and 20th centuries: e.g., the 
idea of a “geometrical calculus,” from which 
the modern vector calculus later arose; his 
attempts to “algorithmize thinking,” in which, 
as Leibniz wrote, the two sides in a contro- 
versy would no longer have to conduct lengthy 
debates inasmuch as one of them could always 
say to the other, “Well, let us verify which of 
us is right, let us calculate, my dear sir.” 
The rough outlines of this kind of “proposition- 
al calculus,” found in the papers of Leibniz, 
closely resemble the mathematical logic of 
the 19th and 20th centuries. 

Newton arrived at the same ideas as Leibniz quite independently and even somewhat earlier. He was perhaps more of a physicist and astronomer than a mathematician; his contributions to the birth of physics and to the rise of a new method in natural science cannot be exaggerated. Joseph Louis Lagrange (1736-1813), the distinguished 18th and 19th century mathematician and investigator in the field of mechanics (we will discuss his work later on) once said of Newton: “He is the most fortunate of men, for the system of the world can be discovered only once.” The same idea is conveyed in the famous “Epitaph for Sir Isaac Newton” written by his contemporary Alexander Pope (1688-1744):

Nature, and Nature’s laws lay hid in night:
God said, Let Newton be! and all was light.

Newton, of course, identified the derivative 
with velocity: he regarded its properties as 
the physical properties of velocity. Yet the 
formal mathematical theory of derivatives (and 
also integrals), founded on a variation of the 
theory of limits, was not alien to him either. 
However, in the absence of a definition of the 
term limit, the theory did not make his con- 
structions more flawless than the (logically not 
indisputable) manipulations of Leibniz with 
infinitesimals, but only made them more un- 
wieldy; also the fact that Leibniz clearly 
exerted a greater influence on European mathe- 
matics than Newton may be connected with 
this. 

Newton gave the name of fluxion to the de- 
rivative, and the initial function from which 
the derivative was calculated was called the 
fluent (from the Latin fluere , to flow), empha- 
sizing that the quantities under consideration 
were variable; the fluxion arose as the rate of 
change of the fluent, while the fluent was re- 
stored from the fluxion as the path from the 
speed. Newton began his exposition of the 
analysis with two basic problems, to which 
all the others are reduced. 

1. Knowing the length of the path traversed, find the speed of motion over a fixed time interval.

2. Knowing the speed of motion, find the length of the path traversed over a fixed time interval.


These are obviously the problems of finding the derivative from the given function, and of finding the function from its derivative, that is, the problem of calculating an indefinite integral. The fact that these two problems (in their geometrical presentation, the drawing of a tangent to a given curve and the calculation of the area bounded by the curve) are inverses of each other had evidently been discovered first by Newton’s teacher, Isaac Barrow (1630-1677), who subsequently resigned his post as Lucasian Professor of Mathematics at Cambridge (a rare case in the history of science!) in favor of his brilliant pupil because he felt that Newton was more worthy of the post than he. However, it is not at all by accident that the connection between derivatives and integrals is named after Newton and Leibniz instead of after Barrow, for only these two great scientists realized the full extent of the mutually inverse nature of the operations of differentiation and integration, which had been discovered by Barrow (and also, independently, by Leibniz), and the possibility of making this fact the foundation for a broad calculation of derivatives and integrals. It is to the far-sightedness of Newton and Leibniz that mathematical analysis owes the rapid progress that began immediately after the first publications and statements by these two great scientists and which truly marked the dawn of a new era, the era of great scientific discoveries, the era of a sweeping assault on Nature’s secrets to promote human welfare.

Priority in applying differential and inte- 
gral calculus to fathom Nature’s secrets un- 
doubtedly goes to Isaac Newton, who put 
forward the general idea that the laws of Nature
must have the form of differential equations
linking the functions that describe the phenomenon
under study. We owe to Newton the
formulation of the basic equations of motion 
(for more details see Chapters 9 and 10), 
which he then applied to deduce the law of 
gravitation and to study the motion of cele- 
stial bodies and the fairly complicated pro- 
blems of celestial mechanics (see Lagrange’s 
above-cited assessment of Newton). In effect, 
Newton’s work gave birth to a theoretical 
physics based on the use of a profound mathe- 
matical apparatus, which physics, in its sub- 
stantive part, has gone far beyond the limits 
of the problems which Newton grappled with 
and, in its mathematical part, rests on the 
whole body of modern science, many times 
surpassing the differential and integral calcu- 
lus created by Newton and Leibniz. 

Newton designated a fluxion (derivative)
by a dot placed above the letter symbolizing
the initial function. For example, he wrote
the derivative of the function $y = f(x)$ as $\dot{y}$.
(This notation is retained today in mechanics,
but we will not use it.) Newton proposed
designating the operation of a transition from a
fluxion to a fluent by a dot placed underneath:
thus, if $y = x^2$ and $y_1 = 2x$, then, in this
system of notation, $\dot{y} = y_1$ and
$\underset{\cdot}{y_1} = y$. The symmetry of these
designations makes them attractive. The theorem of
the connection between the derivative and the
integral can be written as follows:
$\dot{\underset{\cdot}{y}} = y$, where the expression
$\dot{\underset{\cdot}{y}}$ can be understood in two
ways: as the fluxion of the function
$\underset{\cdot}{y}$ and as the fluent of the
function $\dot{y}$. But today this appealing notation
has only historical significance (besides,
Newton himself was not particularly consistent
in using it). The designation $y'$ for the derivative
of a function was introduced by the
French mathematician Augustin Cauchy (1789-
1857), whom we will be hearing more about
later.

One of the regrettable episodes in the hi- 
story of science was the bitter controversy be- 
tween Newton and Leibniz, conducted chiefly 
by their admirers and pupils and not by the 
two outstanding scientists themselves.$^{3.20}$ It
was a controversy that brought neither honour 
nor advantage to either side. Leibniz, accused 
(quite groundlessly, as we know today) of di- 
rectly borrowing his ideas from Newton, lost 
the patronage of the dukes of Hanover, who 
had long supported him, and died in poverty 
and oblivion. His death was not reported in 
a single German (to say nothing of an English) 
publication; and even the Berlin (Prussian) 
Academy, which he founded, took no notice of 
his death. Only the French Academy, of 
which Leibniz had been an active member, 
paid tribute to him in a eulogy. On the other 
hand, the refusal to recognize Leibniz (and 
also the refusal to recognize his differential 
and integral calculus) substantially hindered 
the progress of English science and completely 
separated English pure and applied mathemat- 
ics from continental mathematics: English 
university graduates were not familiar with the 
terms “derivative” and “integral” or with the 
notation introduced by Leibniz, and hence 
were unable to read books and treatises by 

3.20 The Royal Society of London appointed
a special committee to discuss the controversy
over priority (the committee sided with
Newton). This committee, naturally, did not
include scientists whose works it discussed.
More than that, the first phrase of the
committee’s report (published under the title
Commercium Epistolicum in 1712) stated that
only the absence of the scientists involved in 
the controversy could ensure impartiality. But, 
alas, a copy of the draft of the report written 
in the hand so well known to historians of 
science shows with all certainty that the re- 
port (including the first phrase) was written 
completely by the then President of the Royal 
Society, Sir Isaac Newton. On the other hand, 
the behavior of Leibniz in the controversy 
with Newton can also hardly be considered 
irreproachable. 


German or French scientists. It seems to us, 
however, that the disagreement between New- 
ton and Leibniz was not accidental and deser- 
ves a more detailed examination of its causes. 

The fact that Newton and Leibniz simul- 
taneously and, beyond dispute, independently 
discovered differential and integral calculus 
best of all demonstrates the timeliness and 
inevitability of the great scientific revolution 
of the 17th century. Moreover, the different 
(even opposite) psychic make-up and scientif- 
ic programs of these two outstanding scien- 
tists, which shaped the pathways by which 
they arrived at their discoveries, make a 
comparison of their investigations highly
edifying.$^{3.21}$ Leibniz, the philosopher, was largely
guided by what he called the “metaphysics
of infinitesimals” (differentials; incidentally, 
this concept, which came from Fermat, appea- 
red in Newton’s works under another name 
before it did in those of Leibniz). Leibniz was 
carried onwards by the language he had cre- 
ated, by the developed symbols and terminol- 
ogy characteristic of the new calculus. (We 
remind our readers once again of Leibniz’s 
great interest in the science of language: he 
was one of the founders of what is now known 
as comparative-historical linguistics.) 

Moreover, the definition (later included in 
all textbooks) of infinitesimals as variables 
whose limit is equal to zero was undoubtedly 
alien to Leibniz. It corresponded much more 
closely to the scientific thinking of Newton. 
Leibniz regarded infinitesimals as “special 
numbers,” the rules for using which completely 
differed from those governing operations car- 


3.21 These differences in approach and
viewpoint make, we believe, the whole contro- 
versy over priority quite meaningless, since 
now it can be said that the inventions of New- 
ton and Leibniz were entirely different (e.g., 
see A.R. Hall, Philosophers at War: The Quar- 
rel Between Newton and Leibniz, Cambridge 
University Press, Cambridge, 1980). 

Note that analytic geometry was created 
simultaneously and independently by Descartes 
and Fermat; moreover, these two scien- 
tists also belonged to different psychological 
types. In contrast to Descartes, who thought 
largely in terms of physics (and also geomet- 
ry), Fermat can be regarded as an algorith- 
mist similar to Leibniz. This is reflected in his 
far more systematic treatment of the subject 
of analytic geometry. (The “algorithmicity” of 
Fermat’s thinking was most fully manifested 
in his outstanding achievements in the theory 
of numbers, a theory that did not interest 
Descartes in the least; incidentally, Fermat 
had a most profound grasp of physics — he 
discovered a remarkable principle that the 
path of a ray of light from one point to an- 
other, through one or more media, is such that 
the time taken is a minimum. Fermat’s principle
played an outstanding role in the further
development of physics.)
ried out with ordinary (real) numbers.$^{3.22}$ Newton,
the physicist, however, proceeded from
the substantive meaning of the new concepts 
and their role in natural science: he regarded 
the derivative as speed, and the reason why he 
did not carry through to the end the theory 
of limits which he had set forth in outline 
was because he saw no particular need for it. 

In his view, the physical meaning of all the
concepts that he introduced fully justified 
them, and he felt no need for formal mathemat- 
ical theories here. 

We can assume that the profound mutual 
antipathy that arose between Newton and 
Leibniz occurred at the time of their first 
(and, apparently, only) meeting. This was 
when Leibniz came to England to demon- 
strate his version of a “mathematical” (to be 
more exact, arithmetical) machine [the first to 
put forward this idea was the outstanding 
17th-century French mathematician and physicist
Blaise Pascal (1623-1662)$^{3.23}$: the very
idea of such a machine ran counter to New- 
ton’s type of intellect]. Still more remote from 
Newton was Leibniz’s dream (and prevision) 
of “logical machines” (actually, the prototypes 
of today’s computers) for mechanizing mental 
processes. On the other hand, Leibniz comple- 
tely rejected Newtonian mechanics (altho- 
ugh he had an exceptionally high opinion of 


3.22 Interestingly, these ideas of Leibniz
received a fully logical substantiation only in 
1960 in what is known as nonstandard analysis 
introduced by the American mathematician 
Abraham Robinson (1918-1974), which has 
already found definite application. 

3.23 In some respects Leibniz’s machine
surpassed Pascal’s arithmetical machine: it not
only could perform arithmetic operations on 
numbers but could, for instance, extract 
square roots. 


Newton as a mathematician), because the very 
idea of action-at-a-distance (Newtonian gra- 
vitation) contradicted his general scientific and 
religious views. 

The subsequent evolution of the scientific 
ideas of Newton and Leibniz in European 
science is not without interest. Newton emerged 
the indisputable victor in the debate about 
priority, but the outward forms of differential
and integral calculus have come down
to us entirely from Leibniz. As for scientific 
ideology in the broad sense of the word, this 
was influenced for many centuries to a far 
greater degree by Newton than by Leibniz. 
This is illustrated not only by the above-cited 
lines from Pope and Lagrange. The entire 
history of European science in the 18th, 19th, 
and first half of the 20th centuries was inspired
by Newtonian mechanics as a true model
of a scientific theory in the finest meaning of 
the term. And our book, too, owes much more 
to the views of Newton than to the world- 
outlook of Leibniz. 

But in the second half of the present cen- 
tury Leibniz’s dream of “thinking machines” 
has suddenly come true. What is more, to- 
day’s “computerization of knowledge” has once 
again put the algorithmic ideas of Leibniz in 
the forefront of scientific thinking, not to men- 
tion the significance of diverse particular 
achievements of the great German, from the 
elements of mathematical logic he created to 
his serious interest in combinatorics, something
quite unusual for the 17th century. 

Summing up, we can say that while New- 
ton and Leibniz approached higher mathemat- 
ics from different starting points, the subse- 
quent history of science has confirmed the 
unquestionable value of both avenues of 
thought which they represent, the value of 
both Newton’s approaches and the scientific 
ideas of Leibniz. 




Chapter 4 Calculation of Derivatives 


Convenient and pictorial modes of 
notation for various relationships and 
simple rules that permit carrying out 
computations mechanically without er- 
rors are of great significance both for 
teaching and for the development of 
mathematics as such. The place oc- 
cupied by Descartes and Leibniz in the 
history of mathematics is due not only 
to the mathematical discoveries they 
made but also to the new notation 
developed by these scholars; in partic- 
ular, the language of the calculus of 
differentials developed by Leibniz. 

In Chapter 2 we analyzed the mean- 
ing of the derivative concept and gave 
simple examples of how to calculate 
derivatives. In Chapter 4 we present the 
general rules for finding the derivat- 
ives of various functions: polynomials, 
rational functions involving ratios of 
polynomials, algebraic functions in- 
volving the unknown under the radical 
sign (fractional powers), the exponential 
function, the logarithmic function, trig- 
onometric functions, and so forth. We 
will find the general rules for the derivatives
of various combinations of functions
whose derivatives are known: a sum of
functions, a product of functions, a
ratio of functions, a composite function.
A table of derivatives of a number of 
functions that summarizes the work 
carried out in Sections 4.1 to 4.13 is 
given in Appendix 1 at the end of 
the book. 

4.1 The Differential 

From the definition of a derivative
we have the following rule (algorithm)
for calculating the derivative: specify
a value of $x$ and its increment $\Delta x$,
find $f(x)$ and $f(x + \Delta x)$, find the
increment $\Delta f = f(x + \Delta x) - f(x)$, form
the ratio $\Delta f/\Delta x$, and then pass to the
limit as $\Delta x \to 0$. However, the formula
which yields the general expression
for $\Delta f/\Delta x$ for arbitrary $\Delta x$ (not
tending to zero) is, as a rule, more
complicated than the formula for the limit,
$\lim_{\Delta x \to 0} \Delta f/\Delta x = df/dx$,
that is, for the derivative.

For this reason we will frequently
write formulas which are only valid in
the limit, when the increments tend
to zero. For small $\Delta x$ these formulas
will be “almost” valid, that is, exact
equalities are replaced with approximate
ones, and for large $\Delta x$ they may
prove to be totally incorrect. To emphasize
this we will use the symbols $dy$
and $dx$ instead of $\Delta y$ and $\Delta x$. We have
to work out rules for handling the quantities
$dy$ and $dx$, which are called
differentials,$^{4.1}$ so that the basic equation

$$\frac{dy}{dx} = y'(x) \qquad (4.1.1)$$

holds true.

Note that this equation was already 
discussed in Chapter 2. There, however, 
it was not a true equation but only a 
symbolic way of writing the derivat- 
ive; it meant that the notation y' (x) 
and the notation dy/dx coincide. Now, 
however, we wish to consider (4.1.1) 
as a meaningful equation that connects 
the derivative y' (x) and the differen- 
tials dy and dx of the function y and the 
independent variable x, namely, the 
derivative y' (x) must be equal to 
the ratio dy/dx of the differentials. 

Before, we wrote an approximate
expression for the increment of the
function:

$$y(x + \Delta x) - y(x) = \Delta y \approx y'\,\Delta x. \qquad (4.1.2)$$

We can say that Eq. (4.1.2) becomes
exact in the limit, as $\Delta x \to 0$. Here
the words “becomes exact in the limit”
do not mean, of course, only that
at $\Delta x = 0$ the left- and right-hand
sides of (4.1.2) coincide (are equal to
zero); they stress that for very small


4.1 In Latin this word means “difference,”
or increment. In this way, $\Delta x$ and $\Delta y$ are
increments, while $dx$ and $dy$ are differentials, that
is, “almost” increments.
values of $\Delta x$ the left- and right-hand
sides of (4.1.2) are “almost” equal in the
sense that the difference between them
is much smaller than the members
themselves. Indeed, the difference between
the left- and right-hand sides of
(4.1.2) for small $\Delta x$ is of the order of
$(\Delta x)^2$ or even higher (see the end of
Section 2.4), so that it is indeed (for
$y' \neq 0$, of course) much smaller than
$\Delta y$ and $y'\,\Delta x$, which are of the first order
of smallness: even if we divide both
sides of (4.1.2) by a very small $\Delta x$,
we will nevertheless arrive at an “almost”
exact equality. Let us now agree
to replace (4.1.2) with the exact equation

$$dy = y'(x)\,dx. \qquad (4.1.2a)$$

Let us explain this further. We assume
that the differential of the independent
variable, $dx$, is simply the
increment $\Delta x$ (by definition). Then
we define the differential $dy$ of the function
$y = y(x)$ by the exact equation
(4.1.2a),$^{4.2}$ which can be augmented
in the following manner:

$$dy = y'(x)\,dx = y'(x)\,\Delta x. \qquad (4.1.2b)$$

Thus, the differential $dy$ of a function
$y = y(x)$ differs from the increment
$\Delta y$, since for $\Delta y$ the equality (4.1.2)
related to (4.1.2b) is satisfied approximately,
$\Delta y \approx y'(x)\,\Delta x$, while the exact
equality for $\Delta y = f(x + \Delta x) - f(x)$
is of the form

$$\Delta y = y'(x)\,\Delta x + \sigma, \qquad (4.1.3)$$

where $\sigma$ is a quantity of a higher order
of smallness than $\Delta x$. It is this fact
that we have in mind when we speak
of (4.1.2), or $\Delta y/\Delta x \approx y'(x)$, as being
“exact in the limit”: the ratio $\Delta y/\Delta x$
generally differs from $y'(x)$, but
$\lim_{\Delta x \to 0} \Delta y/\Delta x = y'(x)$, since for small
values of $\Delta x$ we can ignore the term $\sigma$
on the right-hand side of (4.1.3). However,
the words about the limit as
$\Delta x \to 0$ and about ignoring a term in
an equation become redundant if we
replace the language of increments
with the language of differentials; as
a result Eqs. (4.1.1), (4.1.2a), and
(4.1.2b) are common (exact) equalities.

4.2 Note that while the approximate equation
(4.1.2) is a theorem, that is, a statement
that requires proof (i.e. we need to prove that
$\sigma$ on the right-hand side of (4.1.3) is of a higher
order of smallness than $\Delta y$ and $\Delta x$),
Eqs. (4.1.2a) and (4.1.2b) are simply definitions
of $dy$.

All this has been discussed in great detail
in Section 2.4 in connection with the question
of approximately calculating the values of
functions (p. 77 on). When we calculate, say,
the values of the function $y = x^2$ or $y_1 = x^3$,
we are justified in assuming that if $\Delta x$ is small,
then $\Delta y \approx 2x\,\Delta x$ or $\Delta y_1 \approx 3x^2\,\Delta x$, since the
error introduced by such an assumption is
negligible. For instance, the approximate
equality $\Delta y \approx 2x\,\Delta x$ means that the increment
$\Delta y$ of the area ($y = x^2$) of the square
$ABCD$ with a side $AB = x$ (Figure 4.1.1) caused
by the increment of the length of the side,
$\Delta x$, can be replaced with the sum of the areas
of the two rectangles $BB_1c_1C$ and $DD_1c_2C$,
ignoring the area of the very small square
$Cc_1C_1c_2$ equal to $(\Delta x)^2$ if $BB_1 = \Delta x$ is small.
If we examine the function $y_1 = x^3$ and put,
say, $x = 2$ and $y_1(x) = 8$, we can construct
the following table of increments
$\Delta y_1 = y_1(x + \Delta x) - y_1(x)$ and differentials
$dy_1 = y_1'(x)\,\Delta x$ for this function and the errors
$\sigma/\Delta y_1 = (\Delta y_1 - dy_1)/\Delta y_1$ introduced by the
approximate formula (4.1.2):


$\Delta x$              1     0.5     0.2     0.1     0.05    0.01     0.001
$\Delta y_1$            19    7.625   2.648   1.261   0.615   0.1206   0.012006
$dy_1$                  12    6       2.4     1.2     0.6     0.12     0.012
$\sigma/\Delta y_1$     37%   21%     9%      5%      2.5%    0.5%     0.05%


This table vividly demonstrates the validity 
of the statement that formula (4.1.2) is exact 
in the limit. 
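
The table above can be recomputed with a few lines of code. The following is only a minimal sketch: the function name `increment_table` and the choice of Python are ours, not the book's, and the function $y_1 = x^3$ is hard-coded.

```python
def increment_table(x, steps):
    """Rows (dx, delta_y, dy, relative error) for the function y1 = x**3."""
    rows = []
    for dx in steps:
        delta_y = (x + dx) ** 3 - x ** 3       # true increment of y1
        dy = 3 * x ** 2 * dx                   # differential y1'(x) * dx
        rel_error = (delta_y - dy) / delta_y   # sigma / delta_y
        rows.append((dx, delta_y, dy, rel_error))
    return rows


for dx, delta_y, dy, rel in increment_table(2.0, [1, 0.5, 0.2, 0.1, 0.05, 0.01, 0.001]):
    print(f"{dx:<6} {delta_y:<12.6f} {dy:<8.4f} {100 * rel:.2f}%")
```

The last column shrinks roughly in proportion to the step, which is what "exact in the limit" means here.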

More information concerning the remainder
$\sigma$ in the approximate formula (4.1.2) can
be found in Chapter 6. There we will prove
that $\sigma$ can be represented in the form of a series
$a(\Delta x)^2 + b(\Delta x)^3 + \ldots$, whose coefficients
are independent of $\Delta x$. But for small
increments $\Delta x$ all the powers, $(\Delta x)^2$, $(\Delta x)^3$,
etc., are much smaller than $\Delta x$, whereby
$|\sigma| = |a(\Delta x)^2 + b(\Delta x)^3 + \ldots|$ is much
smaller than the differential $|dy| = |y'(x)\,\Delta x|$
(which is proportional to $|\Delta x|$).
Of course, the last statement becomes invalid
when $y'(x) = 0$, that is, when the differential
$dy$ vanishes and hence no small positive $|\sigma|$
can be smaller. However, even in this case we
can employ (4.1.2) and ignore the small error
$\sigma$. We advise the reader to ignore, for the
time being, cases where a function $y = y(x)$
has no derivative (and differential) since these
are really exotic (e.g. see Section 7.2).

Figure 4.1.1

The idea of a differential and the
notation $dy/dx$ for a derivative were
introduced by Leibniz, who did not
give much thought to the content of
the idea of a differential, since he was
more interested in the rules of operation
on differentials, rules that constitute
the essence of Leibniz’s calculus
of differentials. However, an exact
meaning can easily be attributed to
the idea of a differential; moreover, this
idea can be illustrated by a drawing.
We can rewrite Eq. (4.1.3) thus:

$$\Delta y = dy + \sigma. \qquad (4.1.3a)$$

This leads us to the following statement:
the differential of a function is the
principal linear part of the increment
of the function. In other words, the
differential $dy = y'(x)\,\Delta x$ ($= y'(x)\,dx$)
is a linear function of the increment
$\Delta x$, while the increment $\Delta y$ generally
depends on $\Delta x$ in a complicated manner
(is not a linear function of $\Delta x$).$^{4.3}$
On the other hand, $dy$ constitutes the
principal linear part of the increment


4.3 The differential $dy = y'(x)\,\Delta x$ of a function
depends on two numbers (variables),
$x$ and $\Delta x$, or is a function of two variables (it
is linear in variable $\Delta x$ but is generally
a complicated function of variable $x$).



Figure 4.1.2 

$\Delta y$ of a function in the sense that for
small $\Delta x$ the second term on the right-hand
side of (4.1.3a) is much smaller
than the first.

The reader will recall that the derivative
$y'(x)$ is the tangent of the
angle $\alpha$ formed by the tangent line
to the curve $y = y(x)$ and the $x$ axis
(see Section 2.5). Therefore, if $PQ = \Delta x$
(Figure 4.1.2), then
$NQ_1 = \tan\alpha \cdot \Delta x = y'(x)\,\Delta x = dy$, while
the increment of the function,
$\Delta y = y(x + \Delta x) - y(x)$, is depicted by
segment $M_1Q_1$. We see that for small values
of $\Delta x$ the differential $dy$ differs
little from the increment $\Delta y$ of the
function, while for large $\Delta x$ this, in
general, is not the case. However, the
representation of the differential of 
a function in the form of a graph is 
needed for a correct understanding of 
the role that differentials play in ap- 
proximate calculations, a role that has 
been described somewhat differently 
in Section 2.4 (without the use of the 
word “differential”). In this chapter we 
need only the technique for handling 
differentials, the symbolic calculus of 
differentials; we will nowhere use the 
exact meaning of the differential il- 
lustrated by Figure 4.1.2, and the reader 
may simply forget about it. 

The concept of a differential can be 
given a simple and graphic mechan- 
ical interpretation. Let us suppose that 
the abscissa x in Figure 4.1.2 is the 
time variable (it is for this reason we 
have labeled the x axis by (t)), while
the ordinate y (or z) is the distance 
covered. We are interested in how the 
distance covered varies with time, 


$y = y(x)$, or in the customary notation
of Chapters 2 and 3, $z = z(t)$. The
idea of a velocity varying constantly
under forces is not very simple; therefore,
in studying motion in the neighborhood
of a fixed moment in time (the
position of a moving body at this moment
is depicted by point $M$ on the
graph of motion), it is convenient to
assume that starting from this moment
the velocity ceases to change (this assumption
is equivalent to the hypothesis
that at that moment we “turn off” the
forces acting on the body, with the
result that subsequent motion is entirely
by inertia, that is, with a constant
velocity; see Chapter 9). Then,
starting with that moment $x$, the velocity
remains constant and equal to
instantaneous velocity $v(x) = dy/dx$ at
time $x$ (or $dz/dt$ at time $t$), and the
distance covered during time $\Delta x$ will
be equal to $v(x)\,\Delta x = y'\,\Delta x = dy$.
(In Figure 4.1.2 the uniform motion
of the body corresponds to the straight
line $l$, while the graph of the real (nonuniform)
motion corresponds to the
curve $\Gamma$.) For small $\Delta x$ (or $\Delta t$), this
hypothetical path $NQ_1$ ($= dy$ or $dz$)
differs from the true path $M_1Q_1$ ($= \Delta y$
or $\Delta z$) by an extremely small quantity
$NM_1 = \sigma$ whose order of smallness
is higher than that of $PQ$ ($= \Delta x$ or
$\Delta t$).$^{4.4}$

The rules for using differentials must 
be such that the ratio of differentials 
is equal to the derivative, or (4.1.1) 
is valid. To achieve this, in formulas 
we have to drop all terms proportional 
to (dx) 2 and higher powers of dx. 


4.4 It was in this mechanical form that
Newton's differential, which he named “fluent” 
appeared; Newton arrived at this concept 
earlier than Leibniz, who did not begin to 
speak about differentials until 1675, whereas 
Newton’s variable quantities (or “fluents,” 
as he preferred to say) were in evidence as 
early as 1666. Even earlier than that, approx- 
imately in the late 1630s, Fermat began to 
use differentials, but without giving them any 
special name or sign; it was on these concepts 
that he based the well-known theorem of max- 
imum and minimum points. However, the 
credit for the extensive calculus of differen- 
tials undoubtedly goes to Leibniz. 


Let us consider an elementary example
and then compare the increment technique
with the technique of differentials.
We take the function $y = x^2$. Originally,
we did as follows:

$$\Delta y = (x + \Delta x)^2 - x^2 = 2x\,\Delta x + (\Delta x)^2, \qquad (4.1.4)$$

whence

$$\frac{\Delta y}{\Delta x} = 2x + \Delta x \quad \text{and} \quad y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = 2x.$$

Using differentials, we write

$$dy = (x + dx)^2 - x^2 = 2x\,dx, \qquad (4.1.4a)$$

where the term $(dx)^2$ in the right member
was dropped immediately.

The meaning of such operations
with differentials is quite clear now.
Of course, $\Delta y = (x + dx)^2 - x^2 = 2x\,dx + (dx)^2$
(see (4.1.4); we remind
the reader that $dx$ and $\Delta x$ are the same
thing). When we go over from $\Delta y$ to
$dy$, we leave only the principal linear
part of $\Delta y$, that is, drop all terms that
involve $\Delta x$ (or $dx$) in second or higher
powers and retain the terms that are
proportional to the first power of $dx$.
These rules of the calculus of differen- 
tials considerably simplify all calcula- 
tions after one gets accustomed to them, 
and in subsequent sections we will 
widely employ them. 
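
The rule "keep only the term linear in $dx$" can be made concrete for $y = x^n$ with a positive integer $n$. The sketch below is an illustration of ours, not part of the original text: it expands $(x + dx)^n - x^n$ by the binomial theorem and then discards every power of $dx$ above the first. The helper names are hypothetical.

```python
from math import comb


def increment_coefficients(n, x):
    """Coefficients c_k in (x + dx)**n - x**n = sum of c_k * dx**k, k = 1..n."""
    return {k: comb(n, k) * x ** (n - k) for k in range(1, n + 1)}


def differential_coefficient(n, x):
    """Keep only the term linear in dx, as the rules for differentials require."""
    return increment_coefficients(n, x)[1]


coeffs = increment_coefficients(2, 5)   # y = x^2 at x = 5
print(coeffs)                           # {1: 10, 2: 1}, i.e. 2x*dx + (dx)^2
print(differential_coefficient(2, 5))   # 10, i.e. dy = 2x*dx
```

For $n = 2$ this reproduces (4.1.4) and (4.1.4a): the increment has the two terms $2x\,dx$ and $(dx)^2$, and the differential keeps only the first.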

Finally, let us discuss a remarkable
property of differentials, a property that
Leibniz knew and rated highly. Let
us take a function, say $y = x^2$. Then

$$y' = 2x \quad \text{and} \quad dy = 2x\,\Delta x. \qquad (4.1.5)$$

Now suppose that $x$ is not an independent
variable but an auxiliary variable
dependent on an independent variable
$t$ (this can be the time variable). Will
(4.1.5) retain its meaning? Of course
not, since if $x = t^2$, then the derivative
of $y = x^2 = t^4$ is equal to $4t^3$ and
not to $2x = 2t^2$. In the second formula
in (4.1.5), $\Delta x$ is now equal to
$(t + \Delta t)^2 - t^2 = 2t\,\Delta t + (\Delta t)^2$, whence

$$2x\,\Delta x = 2t^2\,[2t\,\Delta t + (\Delta t)^2] = 4t^3\,\Delta t + 2t^2\,(\Delta t)^2,$$

which differs from the expression
$dy = 4t^3\,\Delta t$ on the left-hand side of the second
formula in (4.1.5). But the main relationship
(4.1.2a) retains its validity in
the new conditions: here $dx = 2t\,\Delta t$,
so that $2x\,dx = 2t^2\,(2t\,\Delta t) = 4t^3\,\Delta t = dy$.

Is this fact accidental? No, it is not. 
Formula (4.1.2a) retains its meaning 
in the case where x is the independent 
variable and in the case where x is 
an (intermediate) function of another 
independent variable, t (cf. Exercises 
4.1.1 and 4.1.2). 

In view of this it is often more con- 
venient to work with differentials than 
with derivatives and increments, since 
there is no need to remember whether 
x is the independent variable or an 
auxiliary function. We will see many 
examples of this fact below. 
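
The computation with $y = x^2$, $x = t^2$ can be checked with exact rational arithmetic. This is a small sketch under our own choice of numbers ($t = 3$, $\Delta t = 1/10$), which do not come from the text.

```python
from fractions import Fraction

t = Fraction(3)
dt = Fraction(1, 10)

x = t ** 2
delta_x = (t + dt) ** 2 - t ** 2   # true increment of x: 2t*dt + dt^2
dx = 2 * t * dt                    # differential of x: linear part only

dy = 4 * t ** 3 * dt               # differential of y = t^4 with respect to t

print(2 * x * delta_x == dy)       # False: the increment carries an extra 2t^2 (dt)^2
print(2 * x * dx == dy)            # True: dy = y'(x) dx still holds
```

The formula written with the increment $\Delta x$ fails once $x$ depends on $t$, while the formula written with the differential $dx$ survives unchanged.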

The simplest way to substantiate the
above-formulated property of differentials is to use
formula (4.1.3), where $\sigma$ is a quantity of
higher order of smallness than that of $dx$
(and $dy = A\,dx$). In view of this formula we
have

$$\Delta y = B\,dt + \alpha, \qquad (4.1.6)$$

where $B\,dt = dy$, and $\alpha$ is small. On the
other hand,

$$\Delta x = C\,dt + \beta, \qquad (4.1.6a)$$

where $C\,dt = dx$, and $\beta$ is small, and if the
derivative $dy/dx$ at the given point (at $t = t_0$
or at $x = x(t_0) = x_0$) is equal to $D$, then

$$\Delta y = D\,\Delta x + \gamma, \qquad (4.1.6b)$$

where $\gamma$ is small. But from (4.1.6a) and (4.1.6b)
it follows that $D\,dx$ ($= DC\,\Delta t$) coincides with
the differential of $y = y(t)$, since (a) it is
linear in $\Delta t$ and (b) the difference

$$\Delta y - D\,dx = (D\,\Delta x + \gamma) - D\,dx = [D\,(dx + \beta) + \gamma] - D\,dx = D\beta + \gamma$$

is a small quantity (a quantity with an order
of smallness higher than that of $\Delta t$) because $\beta$
and $\gamma$ are such quantities and $D\,dx$ is, of
course, simply $(dy/dx)\,dx$.

The second derivative $y''(x)$ of a function
$y = y(x)$ was denoted earlier (see Section 2.7)
as

$$y''(x) = \frac{d^2 y}{dx^2}. \qquad (4.1.7)$$

Such notation can be justified by a line of
reasoning similar to the one used in connection
with the notation $y' = dy/dx$.

We call the quantity $y''(x)\,(\Delta x)^2$
($= y''(x)\,(dx)^2$) the second differential of $y(x)$
and denote it by $d^2y$. In this case the notation
(4.1.7) for the second derivative $y''(x)$ can be
interpreted as the ratio $d^2y \div (dx)^2$ of the second
differential $d^2y$ of $y$ to the square of the
(first) differential (increment) of the independent
variable $x$, or $(dx)^2$. The second differential
$d^2y$ of the function $y = y(x)$ may also
be depicted on a graph, which offers possibilities
for using the second differential in calculating
values of a function. For the time
being we will write $x_0$ instead of $x$ and $x$ instead
of $x + \Delta x$. We wish to find the parabola $\Pi$,

$$Y = a\,(x - x_0)^2 + b\,(x - x_0) + c, \qquad (4.1.8)$$
that is the closest to the graph $\Gamma$ of the function
$y = y(x)$, that is, a parabola for which
the difference $Y(x) - y(x)$ for small
$x - x_0 = \Delta x$ is as small as possible.$^{4.5}$ The problem
of finding such a parabola (the problem of the
“best quadratic approximation” (4.1.8) of
$y(x)$) coincides with one of the problems we
will discuss in Chapter 6; there we will find
that the coefficients in the equation of parabola
$\Pi$ (4.1.8) are $a = (1/2)\,y''(x_0)$,
$b = y'(x_0)$, and $c = y(x_0)$. In this case the difference
$Y(x) - y(x)$ for small $\Delta x$ is a quantity
of third order of smallness (or even higher) with
respect to $\Delta x = x - x_0$. If we depict the
curve $y = y(x)$, the parabola $Y = Y(x)$, and
the tangent line $l$,

$$Y_1 = kx + s, \qquad (4.1.9)$$

to the curve $\Gamma$ (where $k = y'(x_0)$ and
$s = y(x_0) - kx_0$) in one drawing (Figure 4.1.3),
then the straight line $x = \text{constant}$ will intersect
$\Gamma$, $l$, and $\Pi$ at the points $M_1$, $N$, and
$N_1$, with (see Figure 4.1.3)

$$Q_1M_1 = y(x) - y(x_0) = \Delta y, \qquad Q_1N = y'(x_0)\,\Delta x = dy,$$

$$Q_1N_1 = \tfrac{1}{2}\,y''(x_0)\,(\Delta x)^2 + y'(x_0)\,\Delta x = \tfrac{1}{2}\,d^2y + dy,$$

4.5 This parabola (the osculating parabola
$\Pi$ of curve $\Gamma$; cf. Section 7.9) can be described
as follows. We draw a parabola $\Pi_1$ through
three close-lying points $(x_0, y_0)$, $(x_1, y_1)$, and
$(x_2, y_2)$ of curve $\Gamma$, that is, we select $a$, $b$,
and $c$ in (4.1.8) in such a manner that
$Y(x_0) = y_0$, $Y(x_1) = y_1$, and $Y(x_2) = y_2$;
then, as $x_1$ and $x_2$ tend to $x_0$, the parabola $\Pi_1$
will tend to $\Pi$ (so that $\Pi$ is the limit of $\Pi_1$
as $x_1$ and $x_2$ tend to $x_0$).
i.e.

$$Q_1N = dy \quad \text{and} \quad NN_1 = \tfrac{1}{2}\,d^2y$$

(in the case at hand $d^2y$ is negative).

Figure 4.1.3 provides a persuasive illustration
of the advantage of the approximation
$Y = y_0 + dy + (1/2)\,d^2y$
($= y(x_0) + y'(x_0)\,\Delta x + (1/2)\,y''(x_0)\,(\Delta x)^2$)
for the function $y(x)$ over the approximation
$Y_1 = y_0 + dy$ ($= y(x_0) + y'(x_0)\,\Delta x$): in the second case
point $M_1$ in Figure 4.1.3 is replaced with
point $N$, while in the first it is replaced with
$N_1$, a point that lies closer to $M_1$ than $N$.
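
A numerical comparison makes the same point as the figure. The sketch below is an illustration of ours: it reuses the function $y = x^3$ at $x_0 = 2$ from the table earlier in this section, and the function name `approximations` is hypothetical.

```python
def approximations(x0, dx):
    """Exact value, linear estimate and quadratic estimate of (x0 + dx)**3."""
    y0 = x0 ** 3
    dy = 3 * x0 ** 2 * dx          # first differential  y'(x0) * dx
    d2y = 6 * x0 * dx ** 2         # second differential y''(x0) * dx^2
    exact = (x0 + dx) ** 3
    linear = y0 + dy               # tangent-line value (point N)
    quadratic = y0 + dy + d2y / 2  # osculating-parabola value (point N_1)
    return exact, linear, quadratic


exact, linear, quadratic = approximations(2.0, 0.1)
print(abs(exact - linear), abs(exact - quadratic))
```

For this cubic the linear estimate misses by $6\,(\Delta x)^2 + (\Delta x)^3$ and the quadratic estimate by $(\Delta x)^3$ only, so the second error is far smaller for small $\Delta x$.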

The second differential $d^2y$ of a function
$y = y(x)$ can be given, just as in the case
with the first differential, a certain mechanical
interpretation. We again assume that the
axis of abscissas in Figure 4.1.3 is the time
axis $t$ and the axis of ordinates (the $z$ axis) is
the distance axis. Now suppose that starting
from a certain time (corresponding to point $M$
of $\Gamma$) the body (whose motion we are studying)
moves with uniform acceleration; this assumption
is equivalent to the hypothesis that
the forces acting on the body cease to vary at
that point in time, since a body is uniformly
accelerated if the forces on it are constant (see
Chapter 9). In this case parabola $\Pi$ becomes
the graph of motion and the (hypothetical)
path traversed by the body in time $\Delta x$ (or $\Delta t$)
will be equal to $dy + (1/2)\,d^2y$ (or
$dz + (1/2)\,d^2z$), which in Figure 4.1.3 is depicted
by segment $Q_1N_1$.


Exercises 


4.1.1. Prove that

$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}$$

for (a) $y = x^2$ and $x = \sqrt{t}$, and (b) $y = \sqrt{x}$ and
$x = t^2$.

4.1.2. Suppose that $y = x^2$, with
$x = \sqrt{t + 1}$, and $t = u^2$. Prove that

$$dy = \frac{dy}{dx}\,dx = \frac{dy}{dt}\,dt = \frac{dy}{du}\,du.$$

4.2 Derivatives of a Sum and 
of a Product of Functions 

As another example of how to use the language of differentials we take the sum of two functions, $f(x)$ and $g(x)$, taken with (arbitrary) constant coefficients $A$ and $B$:

$$y = Af(x) + Bg(x).$$

Using differentials, we can write

$$\begin{aligned}
dy &= y(x + dx) - y(x) \\
   &= [Af(x + dx) + Bg(x + dx)] - [Af(x) + Bg(x)] \\
   &= A[f(x + dx) - f(x)] + B[g(x + dx) - g(x)] \\
   &= A\,df + B\,dg = Af'\,dx + Bg'\,dx,
\end{aligned}$$

whence

$$y' = \frac{dy}{dx} = Af' + Bg'. \qquad (4.2.1)$$

The reader can easily arrive at this formula if increments and limits are used instead of differentials.

In particular, for the sum and difference of two functions ($A = B = 1$ and $A = 1$, $B = -1$) we have

$$(f + g)' = f' + g', \qquad (f - g)' = f' - g'. \qquad (4.2.2)$$

Let us now find the derivative of a product of two functions, $g(x)$ and $h(x)$. We put $f(x) = g(x)h(x)$. Then

$$df(x) = f(x + dx) - f(x) = g(x + dx)\,h(x + dx) - g(x)\,h(x).$$

But

$$g(x + dx) = g(x) + dg, \qquad h(x + dx) = h(x) + dh.$$

Whence

$$df = [g(x) + dg][h(x) + dh] - g(x)h(x) = g(x)\,dh + h(x)\,dg + dg\,dh.$$

Note that $dg = g'(x)\,dx$ and $dh = h'(x)\,dx$, whence $dh\,dg = g'(x)h'(x)(dx)^2$. The quantity $dh\,dg$ is proportional to $(dx)^2$ and so, according to the rules for handling differentials, we ignore the product $dh\,dg$ in the expression for $df$. Finally we get

$$df = g(x)\,dh + h(x)\,dg. \qquad (4.2.3)$$

Dividing all the terms in (4.2.3) by $dx$, we get

$$\frac{df}{dx} = \frac{d(gh)}{dx} = g\,\frac{dh}{dx} + h\,\frac{dg}{dx}. \qquad (4.2.4)$$

The expression is remembered in the following manner: the derivative of the product $gh$ is equal to the sum of the derivative taken on the assumption that $h$ is a function of $x$ while $g$ is held constant (the term $g\,(dh/dx)$) and the derivative taken on the assumption that $h$ is held constant and only $g$ depends on $x$ (the term $h\,(dg/dx)$). Here, naturally, the constant value of $g$ in the term $g\,(dh/dx)$ is taken for the $x$ for which the value of the derivative is being sought. The same goes for $h$ in the second term. Below we will see that a similar procedure can be applied in other situations,$^{4.6}$ for instance in the case of $y = f(x)^{g(x)}$.
How would we have handled this in the old way? Simple algebra yields the exact equation

$$\Delta f = (g + \Delta g)(h + \Delta h) - gh = g\,\Delta h + h\,\Delta g + \Delta h\,\Delta g.$$

Dividing both sides by $\Delta x$, we get

$$\frac{\Delta f}{\Delta x} = g\,\frac{\Delta h}{\Delta x} + h\,\frac{\Delta g}{\Delta x} + \frac{\Delta h}{\Delta x}\,\frac{\Delta g}{\Delta x}\,\Delta x. \qquad (4.2.5)$$

Note that, for the sake of convenience, we multiplied and divided the last term by $\Delta x$.

Up to now all the equations are exact and hold true for all values of $\Delta x$. Now pass to the limit as $\Delta x \to 0$. Then

$$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f', \qquad \lim_{\Delta x \to 0} \frac{\Delta g}{\Delta x} = g', \qquad \lim_{\Delta x \to 0} \frac{\Delta h}{\Delta x} = h',$$

and, in view of (4.2.5),

$$f' = gh' + hg'. \qquad (4.2.6)$$

In passing to the limit, the last term on the right-hand side of (4.2.5) vanished since the first two factors yield in the limit the product $h'g'$ and we sent $\Delta x$ to zero.

4.6 It is clear that the formula for the derivative of the sum of functions (if $F = f + g$, then $F' = f' + g'$) also obeys this rule, since if $f$ is assumed constant, the derivative $F'$ is simply $g'$, while if $g$ is constant, the derivative $F'$ is $f'$.

Using increments and passage to the 
limit, we obtain the same result as that 
obtained with the aid of differentials, 
but it takes more time. This is not 
surprising since in the case of differen- 
tials we dropped dhdg mechanically, on 
the basis of an earlier acquired rule 
according to which we have to reject 
terms involving $(dx)^2$, $(dx)^3$, etc., hence,
any products of two, three, or more dif- 
ferentials. When carrying out the com- 
putations with the aid of increments, 
we actually, in the very process, proved 
this rule once again for the case of a 
product of functions. 

The succession of operations using 
increments is needed to justify the 
rules and to understand them. But once 
they are understood, the use of differ- 
entials is faster and more efficient. It 
would be silly every time to start from 
rock bottom in solving a specific prob- 
lem and to write out in full that the 
derivative is the limit of a ratio, and 
so on. 

Example. Suppose that $f(x) = (3x^2 + 5)(2x - 4)$. Find $f'(x)$ and, in particular, $f'(0)$.

Here

$$g = 3x^2 + 5, \quad \frac{dg}{dx} = 3 \cdot 2x + 0 = 6x; \qquad h = 2x - 4, \quad \frac{dh}{dx} = 2 - 0 = 2.$$

Therefore

$$\frac{df}{dx} = (3x^2 + 5) \cdot 2 + (2x - 4) \cdot 6x = 18x^2 - 24x + 10,$$

and in particular

$$f'(0) = \left.\frac{df}{dx}\right|_{x=0} = 10.$$
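The worked example can be checked numerically. The sketch below is only an illustration and not part of the original text: it compares the product-rule expression with a central-difference estimate of the derivative. The helper name `derivative` and its step size are assumptions made for this check.

```python
def derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x); `step` is an assumed small increment."""
    return (func(x + step) - func(x - step)) / (2 * step)


def g(x):
    return 3 * x**2 + 5


def h(x):
    return 2 * x - 4


def f(x):
    return g(x) * h(x)


def f_prime_by_rule(x):
    # Product rule (4.2.4): g * dh/dx + h * dg/dx, with dg/dx = 6x and dh/dx = 2.
    return g(x) * 2 + h(x) * 6 * x


if __name__ == "__main__":
    for x in (0.0, 1.0, -2.5):
        print(x, f_prime_by_rule(x), derivative(f, x))
```

The two printed columns should agree to several decimal places; the small residual difference comes from the finite step, not from the rule.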


The rule for finding the derivative of a product generalizes to the case of many factors. For example, for the product of four functions $f(x)$, $g(x)$, $h(x)$, and $k(x)$ we get

$$\frac{d(fghk)}{dx} = \frac{df}{dx}\,ghk + f\,\frac{dg}{dx}\,hk + fg\,\frac{dh}{dx}\,k + fgh\,\frac{dk}{dx}. \qquad (4.2.7)$$

Thus, here we once more encounter the same rule, according to which the derivative of a product of several functions can be found as the sum of the derivatives computed on the assumption that each time only one function varies while the others remain constant.

The formulas we have obtained enable us to draw further conclusions. It is clear that if $f(x) = g(x) + h(x)$ and $f'(x) = g'(x) + h'(x)$, then

$$f''(x) = (f'(x))' = (g'(x) + h'(x))' = g''(x) + h''(x). \qquad (4.2.8)$$

The formula for the second derivative of a product of functions is somewhat more complicated: if $f(x) = g(x)h(x)$, then $f'(x) = g(x)h'(x) + g'(x)h(x)$ and $f''(x) = (g(x)h'(x) + g'(x)h(x))' = (g(x)h''(x) + g'(x)h'(x)) + (g'(x)h'(x) + g''(x)h(x))$, that is,

$$f''(x) = g(x)h''(x) + 2g'(x)h'(x) + g''(x)h(x). \qquad (4.2.9)$$
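Formula (4.2.9) can be tried out on a concrete pair of functions. The following sketch assumes the illustrative choice $g = x^3$, $h = x^2 + 1$ (not taken from the text) and compares the formula with a second-difference estimate; the function names and the step size are hypothetical.

```python
def second_difference(func, x, step=1e-4):
    """Estimate func''(x) from three neighbouring values; `step` is an assumed increment."""
    return (func(x + step) - 2 * func(x) + func(x - step)) / step**2


def product(x):
    # f = g * h with the illustrative choice g = x^3, h = x^2 + 1.
    return x**3 * (x**2 + 1)


def second_derivative_by_rule(x):
    g, g1, g2 = x**3, 3 * x**2, 6 * x    # g, g', g''
    h, h1, h2 = x**2 + 1, 2 * x, 2       # h, h', h''
    return g * h2 + 2 * g1 * h1 + g2 * h  # formula (4.2.9)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, second_derivative_by_rule(x), second_difference(product, x))
```

Since $f = x^5 + x^3$ here, the exact second derivative is $20x^3 + 6x$, which gives an independent check on both columns.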

Exercises 

4.2.1. Use the rules found above to cal- 
culate the derivatives of the following fun- 
ctions: (a) y = x* (= A 2 ), (b) y = ax 5 + 
$bx^4 + cx^3 + dx^2 + ex + f$, and (c) $y = \sqrt{x}$, using the identity $\sqrt{x}\,\sqrt{x} = x$. [Hint. Use the following equality: $y'y + yy' = x' = 1$.]

4.2.2. Express the third derivative $f'''(x) = (f''(x))'$ of the function $f(x) = g(x)h(x)$ in terms of the functions $g$ and $h$ and their derivatives.

4.2.3. Suppose that $f(x) = g(x)h(x)k(x)$. Find $f'(x)$.

4.3 The Composite Function. 

The Derivative of the Fraction 
of Two Functions 

Let $z$ be given as a function of $y$, say $z = 1/y$, and $y$ as a function of $x$, say $y = x^2 + 5$. It is clear that to each $x$ there corresponds a definite $y$ and, since to each $y$ there corresponds a definite $z$, we see that each $x$ is associated with a definite $z$, that is, $z$ is a function of $x$. It is always possible, by substituting the expression of $y$ in terms of $x$, to write directly formulas for $z(x)$; in the given example, $z = (x^2 + 5)^{-1}$.


Finding derivatives is simplified if we reduce all functions to combinations of the simplest possible functions: separately, each of the functions $z = 1/y$ and $y = x^2 + 5$ is simpler than $z = (x^2 + 5)^{-1}$. By reducing complicated functions to combinations of simpler functions, we are able to get by with the rules for finding the derivatives of these simple functions.

Let us find the differential of the composite function $z[y(x)]$. Regarding $z$ as a function of $y$, we write

$$dz = \frac{dz}{dy}\,dy \qquad (4.3.1)$$

(compare this with what was said in Section 4.1 about the differential of a composite function). But $y$ is a function of $x$, whereby

$$dy = \frac{dy}{dx}\,dx. \qquad (4.3.2)$$

Substituting (4.3.2) into (4.3.1), we obtain

$$dz = \frac{dz}{dy}\,\frac{dy}{dx}\,dx. \qquad (4.3.3)$$

Dividing both sides by $dx$, we get a rule (known as the chain rule) for determining the derivative of a composite function:$^{4.7}$

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \qquad (4.3.3a)$$

The form of the formula is in full accord with what was stated about the possibility of handling differentials as ordinary algebraic quantities: we can cancel out $dy$ in the product $(dz/dy)(dy/dx)$.

Recall that $z$ is given as a function of $y$, and so $dz/dy$ is also a function of $y$. But since $y$ itself is a function of $x$, it follows that by substituting $y = y(x)$ into the expression for $dz/dy$, we get $dz/dy$ as a function of $x$ and, hence, also $dz/dx$ as a function of $x$.


4.7 The prime notation $z'$ instead of $dz/dx$ can lead to confusion. When we write $z'$, it is not clear whether we mean $dz/dx$ or $dz/dy$. (However, we could have written $z'_x$ or $z'_y$.)

Let us carry out computations for a case that will be needed later on. Suppose

$$z = \frac{1}{y(x)}. \qquad (4.3.4)$$

We know that if $z = 1/y$, then $dz/dy = -1/y^2$ and so

$$dz = -\frac{1}{y^2}\,dy = -\frac{1}{y^2}\,\frac{dy}{dx}\,dx$$

and

$$\frac{dz}{dx} = -\frac{1}{y^2}\,\frac{dy}{dx}. \qquad (4.3.5)$$

For example, if $y = x^2 + 5$ and $z = 1/y = (x^2 + 5)^{-1}$, then

$$\frac{dz}{dx} = -\frac{1}{(x^2 + 5)^2}\,\frac{d(x^2 + 5)}{dx} = -\frac{2x}{(x^2 + 5)^2}.$$


Here is another simple example: $y = x^{3/2} = \sqrt{x^3}$. This formula can be rewritten as $y = \sqrt{u}$, with $u = x^3$. According to (4.3.3a) we have

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \frac{1}{2\sqrt{u}} \cdot 3x^2 = \frac{3x^2}{2\sqrt{x^3}} = \frac{3}{2}\sqrt{x}.$$

Earlier we arrived at the same result by a more complicated method (see Exercise 2.4.3).
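The chain-rule computation for $y = \sqrt{x^3}$ can be reproduced step by step. The sketch below is illustrative only: it multiplies the two simple derivatives, compares the product with the closed form $\tfrac{3}{2}\sqrt{x}$, and cross-checks both against a finite-difference estimate. The function names and the step size are assumptions.

```python
import math


def derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x); `step` is an assumed small increment."""
    return (func(x + step) - func(x - step)) / (2 * step)


def y(x):
    return math.sqrt(x**3)


def dy_dx_chain(x):
    u = x**3
    dy_du = 1 / (2 * math.sqrt(u))  # derivative of sqrt(u) with respect to u
    du_dx = 3 * x**2                # derivative of x^3 with respect to x
    return dy_du * du_dx            # chain rule (4.3.3a)


def dy_dx_closed(x):
    return 1.5 * math.sqrt(x)


if __name__ == "__main__":
    for x in (1.0, 4.0, 9.0):
        print(x, dy_dx_chain(x), dy_dx_closed(x), derivative(y, x))
```

The sketch is meant for $x > 0$ only, since $\sqrt{u}$ is not differentiable at $u = 0$.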

The chain rule (4.3.3a) for forming the derivative of a composite function holds true for a more complicated relationship as well. Suppose that $z = z(y)$, $y = y(x)$, $x = x(t)$, and $t = t(w)$. Then

$$\frac{dz}{dw} = \frac{dz}{dy}\,\frac{dy}{dx}\,\frac{dx}{dt}\,\frac{dt}{dw}. \qquad (4.3.6)$$

Using (4.3.5), we can easily find the derivative of a fraction (quotient, ratio) of two functions, $f = h/g$. To do this, we write $f$ in the form of a product: $f = h\,(1/g)$. Then

$$f' = h\left(\frac{1}{g}\right)' + h'\,\frac{1}{g}. \qquad (4.3.7)$$

But $(1/g)' = -(1/g^2)\,g'$ (see (4.3.5)). Substituting this result into (4.3.7), we get the sought result

$$f' = \frac{h'}{g} - \frac{hg'}{g^2},$$

or$^{4.8}$

$$\left(\frac{h}{g}\right)' = \frac{h'g - hg'}{g^2}. \qquad (4.3.8)$$
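The quotient formula (4.3.8) and the two-term form that precedes it can be compared on an example. The sketch below assumes the illustrative choice $h = x^3$, $g = x^2 + 5$ (not from the text); the helper names and the step size are hypothetical.

```python
def derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x); `step` is an assumed small increment."""
    return (func(x + step) - func(x - step)) / (2 * step)


def h(x):
    return x**3


def g(x):
    return x**2 + 5


def quotient(x):
    return h(x) / g(x)


def quotient_rule(x):
    h1 = 3 * x**2  # h'
    g1 = 2 * x     # g'
    return (h1 * g(x) - h(x) * g1) / g(x)**2  # formula (4.3.8)


def two_term_form(x):
    # h'/g + h * (1/g)', with (1/g)' = -g'/g^2 from (4.3.5)
    return 3 * x**2 / g(x) - h(x) * 2 * x / g(x)**2


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0):
        print(x, quotient_rule(x), two_term_form(x), derivative(quotient, x))
```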


Exercises 


4.3.1. Find the derivative of $z = (ax + b)^2$ as that of the composite function $z = y^2$, with $y = ax + b$. Remove the parentheses in $(ax + b)^2$ and find the same derivative directly.

4.3.2. Find the derivatives of the fol- 
lowing functions: 

(a) Z= ax+6 ’ (b) Z== (ax+bf ' 


(c) $z = \dfrac{1}{1 + 1/x}$,


(d) y = 


x 3 5a: 2 
x+1 


(e) $y = \dfrac{x - 1}{x^2 + 2}$.


4.3.3. Suppose that $y = f(x)/g(x)$. Express the second derivative $d^2y/dx^2$ of the function $y$ in terms of the functions $f$ and $g$ and their first and second derivatives.


4.4 The Inverse Function. Parametric 
Representation of a Function 

Specifying $y$ as a function of $x$ signifies that to each $x$ there corresponds a definite value of $y$. Hence, conversely, we can often say that with each definite $y$ there is associated an $x$. For instance, if $y = 8x^3 - 9$, then $x^3 = (y + 9)/8$ and $x = \sqrt[3]{(y + 9)/8} = \tfrac{1}{2}\sqrt[3]{y + 9}$. Thus, the specification of $y(x)$ also yields the functional relation $x(y)$. This relationship is called the inverse function, as we have already said in Section 1.6; on the other hand, $y(x)$ is the inverse of $x(y)$.$^{4.9}$

In many cases the inverse function has a simpler form than the initial function; for instance, the function $y = \sqrt[3]{x - 1}$ contains a cube root, while the inverse function $x = y^3 + 1$ is a polynomial, which is simpler


than $y(x)$. In this case, it is simpler and easier to find the derivative of the inverse function, $dx/dy$, than it is to find the derivative of the initial function, $dy/dx$. The problem now is to express the derivative of the initial function in terms of the derivative of the inverse. For $y(x)$ we have

$$dy = \frac{dy}{dx}\,dx = y'(x)\,dx. \qquad (4.4.1)$$

This enables finding the derivative $x'(y)$ of the inverse function $x(y)$:

$$x'(y) = \frac{dx}{dy} = \frac{1}{y'(x)}. \qquad (4.4.2)$$

(Of course, (4.4.2) once more illustrates that differentials can be treated as numbers, or $dx/dy = 1/(dy/dx)$.) On the right is an expression in the form of a function of $x$. But if we know $x = x(y)$, this expression can be represented as a function of $y$, namely, as $1/y'(x(y))$.

4.8 It is clear that formula (4.3.8) corresponds to the general principle formulated in Section 4.2: $(h/g)' = h'/g + h\,(1/g)'$.

4.9 In the general case the function $x = x(y)$ that is the inverse of the given function $y = y(x)$ is multiple-valued (see Sections 1.6 and 4.11).

A few examples will serve as illustrations of the foregoing. The first example (a linear function) is too simple. We start with the second example:

$$y = x^2, \qquad \frac{dy}{dx} = y'(x) = 2x, \qquad \frac{dx}{dy} = \frac{1}{y'(x)} = \frac{1}{2x}. \qquad (4.4.3)$$

Substituting into (4.4.3) the inverse function $x = \sqrt{y}$, we get

$$\frac{dx}{dy} = \frac{d\sqrt{y}}{dy} = \frac{1}{2x} = \frac{1}{2\sqrt{y}}. \qquad (4.4.4)$$

In Chapter 2 we obtained the same result (the derivative of $y = \sqrt{x}$) in a more roundabout fashion.

Here is a third example:

$$y = x^3 + 1, \qquad \frac{dy}{dx} = y'(x) = 3x^2, \qquad \frac{dx}{dy} = \frac{1}{y'(x)} = \frac{1}{3x^2},$$

$$\frac{dx}{dy} = \frac{d\left(\sqrt[3]{y - 1}\right)}{dy} = \frac{1}{3x^2} = \frac{1}{3\sqrt[3]{(y - 1)^2}} = \frac{1}{3}(y - 1)^{-2/3}$$

(cf. Section 4.5).

If a function is represented parametrically (see Section 1.8), this representation may be regarded as a special case of a composite function. Indeed, if it is given that

$$x = f(t), \qquad y = g(t), \qquad (4.4.5)$$

then the first of these equations may be regarded as an equation whose solution yields $t = t(x)$. Substituting this $t(x)$ into the second equation, we get $y = g(t) = g(t(x))$. Hence,

$$\frac{dy}{dx} = \frac{dy}{dt}\,\frac{dt}{dx}.$$

But to use this formula one need not express $t$ as a function of $x$ (if we were to do this we would get rid of the parameter, but this is not always possible). It suffices to know $x = f(t)$, which is the inverse of $t(x)$. Thus,

$$\frac{dt}{dx} = \frac{1}{dx/dt},$$

and so

$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt}. \qquad (4.4.6)$$

This formula is yet another instance showing that we can handle differentials like ordinary algebraic quantities: the quantity $dt$ in the right member of (4.4.6) is canceled out.

Here is an example. Suppose that

$$x = t^2 - t, \qquad y = t^2 + t. \qquad (4.4.7)$$

Then

$$\frac{dx}{dt} = 2t - 1, \qquad \frac{dy}{dt} = 2t + 1, \qquad \frac{dy}{dx} = \frac{2t + 1}{2t - 1}.$$
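For the parametric example (4.4.7), the slope given by (4.4.6) can be compared with the ratio of small increments of $y$ and $x$ along the curve. This is a minimal sketch under stated assumptions: the function names and the increment `step` are hypothetical, and the check avoids $t = 1/2$, where $dx/dt = 0$.

```python
def x_of_t(t):
    return t**2 - t


def y_of_t(t):
    return t**2 + t


def slope_by_formula(t):
    dx_dt = 2 * t - 1
    dy_dt = 2 * t + 1
    return dy_dt / dx_dt  # formula (4.4.6); undefined where dx/dt = 0


def slope_by_increments(t, step=1e-6):
    # Ratio of the increments of y and x for a small change of the parameter.
    dy = y_of_t(t + step) - y_of_t(t - step)
    dx = x_of_t(t + step) - x_of_t(t - step)
    return dy / dx


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(t, slope_by_formula(t), slope_by_increments(t))
```

No elimination of the parameter is needed, which mirrors the remark that $t(x)$ need not be written out explicitly.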

In constructing the graph of a function represented parametrically, $x = x(t)$ and $y = y(t)$, we set up a table where we specify the values of $x$ and $y$ corresponding to the chosen values of $t$, this set of $x$ and $y$ specifying a point $M = M(x, y)$ on the graph of the function. It is also convenient to calculate for a given $t$ the derivatives $dx/dt$ and $dy/dt$; then the ratio $(dy/dt)/(dx/dt)$ will fix the slope of the tangent to the graph at point $M(x, y)$.

Formulas (4.4.2) and (4.3.5) also make it possible to arrive at formulas for the second derivative of a function $x = x(y)$ that is the inverse of another function $y = y(x)$. Of course, following the line of reasoning that led us to (4.4.2), we would like to think that $x''(y) = 1/y''(x)$, but this is not so, if only because of dimensionality considerations. It is clear that if $x$ and $y$ are expressed in units of $e_1$ and $e_2$, the derivatives $y'(x) = dy/dx$ and $x'(y) = dx/dy$ have the dimensions of $e_2/e_1$ and $e_1/e_2$, respectively, so that the expressions on both sides in (4.4.2) have the same dimensions. However, the second derivatives $y''(x) = d^2y/dx^2$ and $x''(y) = d^2x/dy^2$ have the dimensions of $e_2/e_1^2$ and $e_1/e_2^2$, in view of which $x''(y)$ (whose dimensions are those of $e_1/e_2^2$) can in no way be equal to $1/y''(x)$ (whose dimensions are $e_1^2/e_2$).

The true formula for the second derivative of $x(y)$ can be written as follows:

$$\frac{d^2x}{dy^2} = \frac{d}{dy}\left(\frac{dx}{dy}\right) = \frac{d}{dy}\left(\frac{1}{y'(x)}\right) = \frac{d}{dx}\left(\frac{1}{y'(x)}\right)\frac{dx}{dy} = -\frac{y''(x)}{(y'(x))^2} \cdot \frac{1}{y'(x)}.$$

Thus, finally,

$$x''(y) = -\frac{y''(x)}{(y'(x))^3}. \qquad (4.4.8)$$

(It is clear that if $x$ and $y$ are measured in units of $e_1$ and $e_2$, respectively, then the right-hand side of (4.4.8) has the dimensions of $(e_2/e_1^2)/(e_2/e_1)^3 = e_1/e_2^2$, that is, dimensions that coincide with those of the left-hand side, $x''(y)$.)

For instance, if $y = x^2$, $x = \sqrt{y}$, then $y' = 2x$, $y'' = 2$, and, hence,

$$x''(y) = \frac{d^2(\sqrt{y})}{dy^2} = -\frac{1}{4y^{3/2}} = -\frac{2}{(2x)^3} = -\frac{1}{4x^3}.$$

Similarly it is easy to use (4.4.6) and derive a formula for the second derivative of a function $y = y(x)$ specified parametrically, $x = x(t)$ and $y = y(t)$:

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right), \quad \text{where } x = x(t), \quad \frac{dy}{dx} = \frac{y'(t)}{x'(t)}.$$

Hence,

$$\frac{d^2y}{dx^2} = \frac{\dfrac{d}{dt}\left(\dfrac{y'(t)}{x'(t)}\right)}{x'(t)} = \frac{1}{x'(t)} \cdot \frac{y''(t)x'(t) - y'(t)x''(t)}{(x'(t))^2},$$

that is,

$$\frac{d^2y}{dx^2} = \frac{y''(t)x'(t) - y'(t)x''(t)}{(x'(t))^3}. \qquad (4.4.9)$$

(If $x$, $y$, and $t$ are measured in units of $e_1$, $e_2$, and $e$, then the derivatives $dx/dt$ and $dy/dt$ have the dimensions of $e_1/e$ and $e_2/e$, so that the fraction on the right-hand side of (4.4.6) has the dimensions of $(e_2/e)/(e_1/e) = e_2/e_1$, that is, the dimensions of the left-hand side of (4.4.6). On the other hand, the second derivatives $x''(t)$ and $y''(t)$ have the dimensions of $e_1/e^2$ and $e_2/e^2$, so that the products in the numerator of the right-hand side of (4.4.9) all have the same dimensions, those of $(e_2/e^2)(e_1/e) = (e_1/e^2)(e_2/e) = (e_1e_2)/e^3$, with the dimensions of the entire right-hand side being those of $(e_1e_2/e^3)/(e_1/e)^3 = e_2/e_1^2$, that is, coinciding with the dimensions of the second derivative $d^2y/dx^2$.)

For instance, in the case of function (4.4.7) we have

$$\frac{d^2x}{dt^2} = \frac{d^2y}{dt^2} = 2, \qquad \frac{d^2y}{dx^2} = \frac{2(2t - 1) - (2t + 1) \cdot 2}{(2t - 1)^3} = -\frac{4}{(2t - 1)^3}.$$
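Formulas (4.4.8) and (4.4.9) can be evaluated directly once the first and second derivatives are known. The sketch below is only an illustration; the function names are hypothetical, and the two examples reuse $y = x^2$ and the parametric curve (4.4.7).

```python
def second_derivative_parametric(x1, x2, y1, y2):
    """d2y/dx2 from x'(t), x''(t), y'(t), y''(t), following (4.4.9)."""
    return (y2 * x1 - y1 * x2) / x1**3


def second_derivative_inverse(y1, y2):
    """x''(y) from y'(x) and y''(x), following (4.4.8)."""
    return -y2 / y1**3


def example_parametric(t):
    # x = t^2 - t, y = t^2 + t: x' = 2t - 1, x'' = 2, y' = 2t + 1, y'' = 2.
    return second_derivative_parametric(2 * t - 1, 2, 2 * t + 1, 2)


def example_inverse(x):
    # y = x^2: y' = 2x, y'' = 2.
    return second_derivative_inverse(2 * x, 2)


if __name__ == "__main__":
    print(example_parametric(2.0))  # compare with -4 / (2t - 1)^3 at t = 2
    print(example_inverse(2.0))     # compare with -1 / (4 x^3) at x = 2
```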


Exercises 


4.4.1. Find first derivatives of the follow- 
ing function: (a) y = y^2x-J 1 , (b) y = 

]/ » an d (c) y = l/y r x. 

4.4.2. $x = a\cos t$, $y = b\sin t$. Find the derivative $dy/dx$.

4.4.3. Find the second derivative d 2 yidx t 

f x 1 

of the following functions; (a) y=y ~~ q ' ~\ m r 
(b) $y = 1/\sqrt{x}$, and (c) $x = a\cos t$, $y = b\sin t$.

4.5 The Power Function

Let us consider the power function,

$$y = x^n, \qquad (4.5.1)$$

where $n$ is a constant number. For $n$ a positive integer, $x^n$ is the product of $n$ identical factors,

$$y = \underbrace{x \cdot x \cdot x \cdots x}_{n\ \text{factors}},$$

and, hence (via formula (4.2.7)),

$$\frac{dy}{dx} = \underbrace{1 \cdot x^{n-1} + 1 \cdot x^{n-1} + \ldots + 1 \cdot x^{n-1}}_{n\ \text{terms}},$$

whence

$$\frac{dy}{dx} = nx^{n-1}. \qquad (4.5.2)$$

We will show that this formula holds true for arbitrary $n$, whether fractional or negative.$^{4.10}$

For fractional $n$ we write $n = m/p$, where $m$ and $p$ are integers:

$$y = x^{m/p}, \qquad (4.5.3a)$$

or

$$y^p = x^m. \qquad (4.5.3b)$$

The expression $y^p$ on the left-hand side of (4.5.3b) is a composite function of $x$, since $y$ depends on $x$. Therefore, calculating the derivatives of both sides and employing (4.3.3a), we obtain

$$\frac{d(y^p)}{dx} = \frac{d(x^m)}{dx}, \quad \text{or} \quad py^{p-1}\,\frac{dy}{dx} = mx^{m-1},$$

whence

$$\frac{dy}{dx} = \frac{m}{p}\,\frac{x^{m-1}}{y^{p-1}} = \frac{m}{p}\,\frac{x^{m-1}}{(x^{m/p})^{p-1}} = \frac{m}{p}\,x^{m/p - 1}.$$

Noting that $m/p = n$, we finally have

$$\frac{dy}{dx} = nx^{n-1}.$$



4.10 The reader will immediately see that it is valid at $n = 0$, when $y = \text{constant}$ and $dy/dx = 0$.

For a negative exponent we write $n = -k$, where $k$ is a positive integer, so that $y = x^n = x^{-k} = 1/x^k$. Using again (4.3.3a) (to be more precise, (4.3.5); here $y = 1/f$ and $f = x^k$), we find that

$$\frac{dy}{dx} = -\frac{1}{f^2}\,\frac{df}{dx} = -\frac{1}{x^{2k}}\,kx^{k-1} = -kx^{-k-1}.$$

Recalling that $k = -n$, we find that for $n$ negative

$$\frac{dy}{dx} = \frac{d(x^n)}{dx} = nx^{n-1},$$

which coincides with (4.5.2).
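The claim that (4.5.2) covers positive, fractional, and negative exponents alike can be spot-checked numerically. The sketch below is an illustration only; the helper `derivative`, its step size, and the sample exponents are assumptions, and the check is restricted to $x > 0$.

```python
def derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x); `step` is an assumed small increment."""
    return (func(x + step) - func(x - step)) / (2 * step)


def power_rule(x, n):
    return n * x**(n - 1)  # formula (4.5.2)


def numeric_power_derivative(x, n):
    return derivative(lambda s: s**n, x)


if __name__ == "__main__":
    x = 2.0
    for n in (3, 2 / 3, -2, 0.5, 0):
        print(n, power_rule(x, n), numeric_power_derivative(x, n))
```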

Thus, the formula for the derivative of a power is applicable to any rational exponent $n$. It can also be extended to the case of an irrational exponent.$^{4.11}$ Formula (4.5.2) is of greatest importance. It implies, for one, that for $r$ small in absolute value we can write

$$(1 + r)^n \approx 1 + nr. \qquad (4.5.4)$$

Indeed, if $x = 1$, $\Delta x = r$, and $y = x^n$, then $\Delta y = (x + \Delta x)^n - x^n = (1 + r)^n - 1$. On the other hand, $dy = y'(x)\,\Delta x = n \cdot 1^{n-1} \cdot r = nr$. Formula (4.5.4) follows from the fact that (for $\Delta x = r$ small) the increment $\Delta y$ is close to the differential $dy$.
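How close $(1 + r)^n$ stays to $1 + nr$ can be seen by tabulating both for a few small $r$. This is a minimal sketch; the function names and the sample values of $r$ and $n$ are assumptions chosen for illustration.

```python
def exact(r, n):
    return (1 + r)**n


def linear_estimate(r, n):
    return 1 + n * r  # approximation (4.5.4)


if __name__ == "__main__":
    for n in (5, 0.5, -2):
        for r in (0.1, 0.01, 0.001):
            gap = exact(r, n) - linear_estimate(r, n)
            print(n, r, exact(r, n), linear_estimate(r, n), gap)
```

The printed gap should shrink roughly a hundredfold each time $r$ is reduced tenfold, as expected for an error proportional to $r^2$.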

It is also expedient to write (4.5.2) in another form:

$$\text{if } y = cx^n, \text{ then } \frac{dy}{dx} = n\,\frac{y}{x}. \qquad (4.5.5)$$

One has to get the feeling of this result. For positive $n$, the power function has the obvious property that for $x = 0$, $y$ is also equal to 0. For a given $n > 0$, the curve $y = cx^n$ can be drawn through any point $(x_0, y_0)$, just choose $c = y_0/x_0^n$. Let the curve pass through the origin and through the point $(x_0, y_0)$. We wish to find the average value of the derivative over the section of the curve between the origin and point $(x_0, y_0)$. According to the definition of an average (see Section 7.8),

$$\overline{y'} = \frac{1}{x_0}\int_0^{x_0} y'(x)\,dx,$$

whence, using (3.4.9), we get

$$\overline{y'} = \frac{y(x_0) - y(0)}{x_0} = \frac{y(x_0)}{x_0} = \frac{y_0}{x_0}.$$

Indeed, as $x$ varies from 0 to $x_0$, the value of $y$ grows from 0 to $y_0$. Hence, the average rate of growth of $y$ (i.e. the average derivative) is equal to $y_0/x_0$, a result obvious without integrals.

4.11 The function $y = x^n$, where $n$ is an irrational number, is defined by the condition $x^n = \lim x^r$, where $r$ is a rational number and $\lim r = n$, whereby if formula (4.5.2) is valid for all rational exponents $n$, it will necessarily be valid for irrational exponents.

As evident from (4.5.5), the value of the derivative at point $(x_0, y_0)$ is $n$ times the average value of the derivative ($n$ is the exponent). Figure 4.5.1 depicts a number of curves with different $n$ ($n = 1/2, 1, 2, 5$) passing through one and the same point $N(x_0, y_0)$ and, hence, having the same average derivative on the interval from 0 to $x_0$. It is clearly evident that the larger the $n$, the greater the derivative at point $N$ (the more steeply the curve rises).

Let us return to formula (4.5.5), $dy/dx = n\,(y/x)$, whence $dy = n\,(y/x)\,dx$, and so for small increments $\Delta x$ we have

$$\Delta y \approx n\,\frac{y}{x}\,\Delta x. \qquad (4.5.6)$$

For $\Delta x = 0.01x$, that is, for a 1% variation in the independent variable, formula (4.5.6) can be assumed exact. Then

$$\Delta y \approx n\,\frac{y}{x} \times 0.01x, \quad \text{or} \quad \Delta y \approx n \times 0.01y.$$

Thus, when the independent variable varies by 1%, the power function (4.5.1) varies by $n$%.
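The "1% in $x$ gives about $n$% in $y$" rule can be compared with the exact percentage change. The sketch below is illustrative; the function name and the sample exponents are assumptions. The constant $c$ in $y = cx^n$ cancels out of the relative change, so it does not appear.

```python
def percent_change(n, percent_x=1.0):
    """Exact percentage change of y = c * x**n when x changes by percent_x percent."""
    return ((1 + percent_x / 100)**n - 1) * 100


if __name__ == "__main__":
    for n in (0.5, 1, 2, 5, -1):
        print(n, percent_change(n))
```

For larger exponents or larger changes in $x$ the exact figure drifts away from $n$, which is a reminder that (4.5.6) is a small-increment statement.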


Exercises 

4.5.1. Find the derivatives of the follow- 

ing functions: (a) y = x 5 — Sx^-^-x^-^-lx 2 — 
2x + 5, (b) $y = (x^3 + x + 1)^2$, (c) $y = (x^2 - x + 1)^4$, (d) $y = (3x^2 - 1)^{10}$,

(e) y =\ r x 2 — 1, and (f) y = f / ’z 2 . 

4.5.2. Obtain formula (4.5.2) for $n$ a positive integer with the aid of the binomial theorem. [Hint. Use the binomial theorem (Section 6.4) to calculate $\Delta y = (x + \Delta x)^n - x^n$.]

4.5.3. Find the values of $y(9)$ and $y(11)$ if $y(10) = 5$ and (a) $y \propto \sqrt{x}$, (b) $y \propto 1/x$, and (c) $y \propto x^2$, where $\propto$ stands for proportional. Solve the problem mentally without any computations. Compare the answers with the exact values.


4.6 Derivatives of Algebraic Functions 

The collection of rules in Sections 4.1-4.5 enables finding the derivative of any function involving addition, subtraction, multiplication, division, and raising to an (arbitrary) power including fractional powers, i.e. extraction of roots, of the independent variable $x$.

An example will illustrate how to do this. Let us find the derivative of the function

$$f(x) = x\,\sqrt[3]{x^2 - 1}.$$

It is best to write the answer at once, that is, without introducing any new designations (such as $\sqrt[3]{x^2 - 1} = y$). The derivative is taken, as it were, separately with respect to each part involving $x$, and we say roughly the following (the letters “a,” “b,” “c,” “d” show to what portions in the expression of the derivative the words refer): (a) the derivative with respect to the $x$ in front of the radical sign plus (b) the derivative with respect to $\sqrt[3]{x^2 - 1}$ multiplied by (c) the derivative of $\sqrt[3]{x^2 - 1}$ with respect to $x^2 - 1$ multiplied by (d) the derivative of $x^2 - 1$ with respect to $x$:

$$\frac{df}{dx} = \underbrace{\sqrt[3]{x^2 - 1}}_{(a)} + \underbrace{x}_{(b)} \times \underbrace{\frac{1}{3}\,\frac{\sqrt[3]{x^2 - 1}}{x^2 - 1}}_{(c)} \times \underbrace{2x}_{(d)}. \qquad (4.6.1)$$

It is well to get used to this efficient approach (without indulging in a lot of writing) by applying the following principles:

(a) The rule for differentiating a composite function (Section 4.3, formulas (4.3.3a) and (4.3.6)).

(b) If the expression is made up of several functions, then its derivative is equal to the sum of the derivatives computed on the assumption that each time only one of the functions is taken to be variable and the rest are held constant (see Section 4.2, formulas (4.2.4), (4.2.7), and also (4.2.1), (4.2.2), and (4.3.8)).
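The two principles can be seen at work on the function $f(x) = x\,\sqrt[3]{x^2 - 1}$ of (4.6.1). The sketch below writes the derivative portion by portion and compares it with a finite-difference estimate; it is an illustration only, with hypothetical helper names, and is restricted to $x > 1$ so that the cube root of a positive number is taken.

```python
def derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x); `step` is an assumed small increment."""
    return (func(x + step) - func(x - step)) / (2 * step)


def cube_root(u):
    return u ** (1 / 3)  # used here for u > 0 only


def f(x):
    return x * cube_root(x**2 - 1)


def f_prime(x):
    root = cube_root(x**2 - 1)
    portion_a = root                          # derivative with respect to the leading x
    portion_b = x                             # factor kept from the leading x
    portion_c = (1 / 3) * root / (x**2 - 1)   # d(u^(1/3))/du written as n * u^n / u
    portion_d = 2 * x                         # d(x^2 - 1)/dx
    return portion_a + portion_b * portion_c * portion_d  # formula (4.6.1)


if __name__ == "__main__":
    for x in (1.5, 2.0, 3.0):
        print(x, f_prime(x), derivative(f, x))
```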

The formula for the derivative of a power is conveniently used in the form

$$\frac{d(x^n)}{dx} = n\,\frac{x^n}{x},$$

x 2 — 2 


(i + x 2 ) V i+x 2 


4.6.16. y = xj/( 2x + 3) 2 . 


4.6.17. 


y 


l. 


4.6.18. y = 

lEi 

c+i 


(x 3 — 1 ) Y X — 1 + x f/1 

xj/_ (2x 3) 2 _ 4.6.19. y ••= 


(* — 1)* 

/Hr- 4 - 6 - 20 - »- *’+f+‘ yi+r. 


4.6.21. y = 


__ xY_ z 2 — 1 

x 3 + i 


4.6.22. y = 


Y • 4 - 6 - 23 - y= x ^* 2 - 1 

V Y~~x 4.6.24. y — ( x + -t-V' 1 

v y x > 

4.6.25. y 


X 


(1+ a :) 1 / 3 


as was done above (see portion “c” in 
(4.6.1)). 

To acquire the necessary facility in 
handling these rules, a good 10 to 20 
exercises devoted to the pure techniques 
(without regard for the physical or 
geometrical content of the problems 
involving finding derivatives) are def- 
initely needed. 


Exercises 


Find the derivatives of the following func- 
tions: 4.6.1. y = x s (x 2 — l) 2 . 4.6.2. y — 
x 3 Y x2 + x. 4.6.3. y = x b Y x2 — 1 (a: 3 — 2 x) 1 / 5 . 

4.6.4. y=(x - 1 jjrr-) Y — 2. 4.6.5. y = 

V y x ' 

x 2 Y Y x + x- y — (^Yx + -at- j 


4.6.7. y = 


-. 4.6.8. y-- 


4.6.9. y = 

3a: — 1 


1— a; 2 

(x-l)(x + 3) 


x+i 

- b Y^+ 2- 4.6.11. y 


Y~x 
_ x 2 +x + i 

X 2 — X 4 - 1 

4.6.10. y = 

x 


4.6.12. y = 


Y x +i 


Y x2 — 1 

. 4.6.13. y— Y- x 2 -\-xY x - 


y = Y x 2 + V x • 4.6. 


15. 


y = 


4.7 The Exponential Function 

Consider the function $y = a^x$, with $a > 1$. The graphs of this function for $a = 2$ and $a = 3$ are shown in Figure 4.7.1a. When $x = 0$, $y = 1$ for arbitrary $a$ (all graphs pass through the point $(0, 1)$).

For all x , the function y is positive 
and grows with increasing x so that the 



4.6.14. 


Figure 4.7.1

derivative is everywhere positive as well. When $x$ is increased by a constant quantity $c$, we get $y(x + c) = a^{x+c} = a^c a^x = ba^x = by(x)$, with $b = a^c$, that is, the quantity $y$ is multiplied by a constant quantity. Thus, if $x$ is varied in successive fashion, by identical steps (in arithmetic progression),

$$x = x_0,\ x_0 + c,\ x_0 + 2c,\ \ldots,\ x_0 + nc,\ \ldots,$$

then $y$ will assume the values

$$y_0,\ by_0,\ b^2y_0,\ \ldots,\ b^ny_0,\ \ldots.$$

It will be recalled that such a law of growth is called a geometric progression.

Let us find the derivative of the exponential function for $a = 10$.$^{4.12}$ We can write

$$\frac{d(10^x)}{dx} = \frac{10^{x+dx} - 10^x}{dx} = 10^x\,\frac{10^{dx} - 1}{dx},$$

where $dx$ (instead of $\Delta x$) emphasizes the passage to the limit as $\Delta x \to 0$.$^{4.13}$

What is the quantity $(10^{dx} - 1)/dx$? It is the limit of the ratio $(10^{\Delta x} - 1)/\Delta x$ as $\Delta x \to 0$. Let us find this limit numerically, arithmetically. Using a four-place table of logarithms or a pocket calculator with a $y^x$ key, we get

$$10^{0.1} \approx 1.2589, \qquad \frac{10^{0.1} - 1}{0.1} \approx 2.589;$$

$$10^{0.01} \approx 1.0233, \qquad \frac{10^{0.01} - 1}{0.01} \approx 2.33;$$

$$10^{0.001} \approx 1.0023, \qquad \frac{10^{0.001} - 1}{0.001} \approx 2.3.$$

Thus, we have

$$\frac{10^{dx} - 1}{dx} \approx 2.3$$

(where, of course, the constant on the right-hand side has been determined only approximately, albeit reliably). Hence,

$$\frac{d}{dx}(10^x) \approx 10^x \times 2.3. \qquad (4.7.1)$$

We found the derivative of $10^x$ in experimental fashion, so to speak, by means of an arithmetic experiment. For any other exponential function it is now easy to reduce the problem to the preceding one. Using the concept of a logarithm, we write

$$a = 10^{\log a}, \qquad a^x = 10^{x \log a}. \qquad (4.7.2)$$

By the rule for finding a derivative of a composite function, we get

$$\frac{d(a^x)}{dx} = \frac{d(10^{x \log a})}{dx} \approx 10^{x \log a} \times 2.3 \log a = a^x \times 2.3 \log a. \qquad (4.7.3)$$

4.12 $a = 10$ is taken simply to facilitate computation involving a table of base-10 logarithms. The reader who owns a pocket calculator with a $y^x$ key can write out the values of, say, $2^{0.1}$, $2^{0.01}$, ... or $3^{0.1}$, $3^{0.01}$, ... and the ratios corresponding to these values: $(2^{0.1} - 1)/0.1$, etc.

4.13 It is this fact that makes us write $\dfrac{d(10^x)}{dx} = \dfrac{10^{x+dx} - 10^x}{dx}$ (an exact equation) instead of $\dfrac{d(10^x)}{dx} \approx \dfrac{10^{x+\Delta x} - 10^x}{\Delta x}$ (an approximate equation).

The remarkable peculiarity of the exponential function is that its derivative is directly proportional to the function itself:

$$\frac{d(a^x)}{dx} = ka^x, \qquad (4.7.4)$$

where the proportionality factor $k$ depends on base $a$. Therein lies the chief property of a geometric progression: the greater the quantity, the faster the growth.$^{4.14}$
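The arithmetic experiment behind (4.7.1) and the proportionality stated in (4.7.4) are easy to repeat by machine. The sketch below is illustrative only: `ratio` reproduces the quotient $(a^{\Delta x} - 1)/\Delta x$ for shrinking steps, and `exp_derivative_numeric` estimates the derivative of $a^x$ so that its ratio to $a^x$ can be inspected. Both names and the step sizes are assumptions.

```python
def ratio(base, step):
    """The quotient (base**step - 1) / step from the arithmetic experiment."""
    return (base**step - 1) / step


def exp_derivative_numeric(base, x, step=1e-6):
    """Central-difference estimate of d(base**x)/dx; `step` is an assumed increment."""
    return (base**(x + step) - base**(x - step)) / (2 * step)


if __name__ == "__main__":
    for step in (0.1, 0.01, 0.001, 1e-6):
        print(step, ratio(10, step))
    # The ratio derivative / function should not depend on x, as (4.7.4) states.
    for x in (0.0, 1.0, 3.0):
        print(x, exp_derivative_numeric(2, x) / 2**x)
```

Very small steps eventually lose accuracy to rounding, so the step sizes above are kept moderate.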

If $0 < a < 1$, the graphs of the exponential function will be as shown in Figure 4.7.1b. When $x$ increases in arithmetic progression, $y$ diminishes in geometric progression. Formula (4.7.3) is still applicable, but now $\log a$ is negative (since $a < 1$) and, hence, the derivative, which is proportional to the function, is of opposite sign. In Part 2 we will give some instances of a quantity diminishing in time in such a manner that the rate of decrease is proportional to the quantity at a given instant:

$$\frac{dy}{dx} = -cy, \qquad c > 0. \qquad (4.7.5)$$

From the foregoing it is evident that in this case the solution of the problem is the exponential function

$$y = y_0a^x, \qquad a < 1. \qquad (4.7.6)$$

4.14 This property of geometric progressions, their exceedingly fast rate of build-up, is a favorite topic in many popular-science books on mathematics, for instance, in Ya.I. Perelman’s Mathematics Can Be Fun (Mir Publishers, Moscow, 1985).

Let us find the number e solely on 
the basis of formula (4.8.1). By the gen- 
eral properties of exponential func- 
tions, e° — 1 . Let us consider the func- 
tion y = e x . We have y (0) = 1 and 
y (0) = 1 (see (4.8.1)). 

We take a small Ax = r and com- 
pute the increment of y = e x when pass- 
ing from x = 0 to x = r. We know that 
we can write Ay~y'(x)Ax, which 
means that Ay ~ 1 X Ax — r and 
y (x) = y (0) + Ay , whence 


Exercises 


1 + r. 


(4.8.2) 


Find the derivatives of the following func- 
tions: 

4.7.1. y = 2*. 4.7.2. y = 5* +1 . 4.7.3. y = 


(1/2)*. 4.7 .4. y = 10^. 
4.7.6. y = 2 x+1 f x . 


4.7.5. y=2 


4.8 The Number e 


Let us find a base $a$ for which the derivative of an exponential function $y = a^x$ is of the simplest form, namely, such that the coefficient in the expression for the derivative, $k$ in (4.7.4), is equal to unity, so that it need not be written at all. We denote this base by $e$. Thus, by definition,

$$\frac{de^x}{dx} = e^x. \tag{4.8.1}$$

This number is easily found by formula (4.7.3): $2.3 \log e \approx 1$, $\log e \approx 1/2.3 \approx 0.43$, whence, referring to a table of logarithms, we get $e \approx 2.7$. This practical approach, however, does not follow the historical development of mathematics and is fundamentally unsatisfactory. We made use of numbers taken from logarithmic tables and did not stop to think how they were computed.^{4.15}

Let us therefore start from the definition (4.8.1) itself. At $x = 0$ both $e^x$ and its derivative are equal to unity, so that for a small number $r$

$$e^r \approx 1 + r. \tag{4.8.2}$$

This equation should be interpreted in the same way as we interpreted $y(x + \Delta x) \approx y(x) + y'(x)\,\Delta x$, that is, in (4.8.2) all terms of the second and higher orders, or terms proportional to $r^2$, $r^3$, etc., are ignored, while only the first order term (proportional to $r$) is retained. What is important here is that the coefficient of $r$ in (4.8.2) is exactly unity; this follows from the property of the number $e$ formulated in (4.8.1) (this is simply the definition of $e$).

We write the small number $r$ as a fraction with a large denominator, $r = 1/n$: if $r \ll 1$, then $n \gg 1$.^{4.16} Then from (4.8.2) we get $e^{1/n} \approx 1 + 1/n$, whence $e \approx (1 + 1/n)^n$.

Since formula (4.8.2), which ignores terms proportional to $r^2$, $r^3$, etc., is the more exact the smaller $r$ is, the last expression for $e$ is the more exact the larger $n$ is. We have thus arrived at another rigorous definition of $e$:

$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n \tag{4.8.3}$$

(this is read: $e$ is the limit of $(1 + 1/n)^n$ as $n$ tends to infinity).^{4.17}

^{4.15} The accuracy with which $e$ can be determined in this manner is low; it is highly improbable that, using a four-place table of logarithms for determining the derivative of $10^x$, you could find the correct value of the second digit after the decimal point in $e$. Actually, $\frac{1}{10^x}\frac{d\,10^x}{dx} \approx 2.302585$, but of course this value was obtained without the use of logarithmic tables. Even less suitable are school logarithmic tables for finding a value of $e$ that can easily be remembered (there is really no sense in this!): $e \approx 2.718281828459045$ (15 decimal places!). Note that $e$ cannot be expressed by a periodic decimal fraction.

On questions concerning the formulas (and methods) that enable finding $e$ with a very high accuracy, see Sections 6.1 and 6.2.

^{4.16} The notation $r \ll 1$ means that the number $r$ is very small compared to unity; similarly, the notation $n \gg 1$ means that $n$ is a very large number ($n$ is considerably greater than unity).

Similarly, for a fixed $k$ and a small $r$ (such that $kr \ll 1$) we get $e^{kr} \approx 1 + kr$, which yields, if we put $r = 1/n$ (where $n$ is very large), $e^k \approx (1 + k/n)^n$, or, to be more precise,

$$e^k = \lim_{n \to \infty} \left(1 + \frac{k}{n}\right)^n. \tag{4.8.3a}$$

One should not fear such words as “limit” or “infinity”. Actually $(1 + 1/100)^{100} \approx 2.705$, which is only slightly different from the value of $e$. We advise the reader to find $(1 + 1/8)^8$ for himself.
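The approach of $(1 + 1/n)^n$ to $e$ can be watched numerically. The following is only a minimal sketch; the function name `compound_estimate` and the chosen values of $n$ are illustrative assumptions, not part of the text.

```python
import math


def compound_estimate(n: int) -> float:
    """Return (1 + 1/n)**n, the n-th approximation to e suggested by (4.8.3)."""
    return (1.0 + 1.0 / n) ** n


if __name__ == "__main__":
    # The gap between e and the estimate shrinks as n grows.
    for n in (1, 8, 100, 10_000, 1_000_000):
        approx = compound_estimate(n)
        print(f"n = {n:>9}: (1 + 1/n)^n = {approx:.6f}, e - estimate = {math.e - approx:.2e}")
```

The estimate always stays below $e$ and creeps up slowly, which is why more efficient formulas are used in practice to compute $e$ to many places.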

We have found that $e^r \approx 1 + r$ for small $r$, and this is the more exact the smaller $r$ is.^{4.18} Let us check this using numbers. We set up the following table:^{4.19}

r       -0.5     -0.4     -0.3     -0.2     -0.1     -0.01    0
1 + r    0.5      0.6      0.7      0.8      0.9      0.99    1
e^r      0.6065   0.6703   0.7408   0.8187   0.9048   0.9901  1

r        0.01     0.1      0.2      0.3      0.4      0.5
1 + r    1.01     1.1      1.2      1.3      1.4      1.5
e^r      1.0101   1.1052   1.2214   1.3499   1.4918   1.6487


We see that even when $r = \pm 0.3$, the error introduced by (4.8.2) does not exceed 6%. A physicist or engineer must remember not only that $e \approx 2.72$ but also that $e^2 \approx 7.4$, $e^3 \approx 20$, $e^4 \approx 55$, $e^5 \approx 150$. Values of $e^x$ and $e^{-x}$ are given in Table A4.1 at the end of the book.
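The comparison of $1 + r$ with $e^r$ is easy to repeat by machine. This is a small sketch under the assumption that the error is measured relative to $e^r$; the name `linear_error` is hypothetical.

```python
import math


def linear_error(r: float) -> float:
    """Relative error of the approximation e**r ≈ 1 + r, measured against e**r."""
    return abs(math.exp(r) - (1.0 + r)) / math.exp(r)


if __name__ == "__main__":
    # Compare 1 + r with e**r for a range of small arguments.
    for r in (-0.5, -0.3, -0.1, -0.01, 0.01, 0.1, 0.3, 0.5):
        print(f"r = {r:+.2f}  1 + r = {1 + r:.4f}  e^r = {math.exp(r):.4f}  "
              f"error = {100 * linear_error(r):.2f}%")
```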

^{4.17} Of course, from the viewpoint of a pedantic mathematician the statement that from $e^{1/n} \approx 1 + 1/n$ follows $e \approx (1 + 1/n)^n$ is not rigorous, since both sides of the approximate equality (valid only for $n \gg 1$) cannot always be raised to a large power $n$. Thus, our reasoning does not prove the definition (4.8.3), but just explains it. Nevertheless, we hope the reader will find the reasoning convincing.

^{4.18} Detailed tables have been compiled for the function $y = e^x$.

^{4.19} The values of $e^x$ are taken from a four-place table.

The number $e$ greatly simplifies the solution of problems involving geometric progressions and compound interest. Consider the following example. How many times will production increase over a period of 50 years if the annual growth rate is 2%? We have to compute $1.02^{50}$. The use of the number $e$ consists in our setting, approximately, $1.02 = e^{0.02}$ (see formula (4.8.2)), whence $1.02^{50} \approx e^{0.02 \times 50} = e \approx 2.72$. The general formula is

$$(1 + r)^m \approx e^{mr}, \quad r \ll 1. \tag{4.8.4}$$

To apply this formula $r$ must be small (there is no other requirement), while $m$ and $mr$ need not be small. If $mr$ is also small, then $e^{mr} \approx 1 + mr$, and we obtain the earlier formula

$$(1 + r)^m \approx 1 + mr \tag{4.8.5}$$

(see formula (4.5.4)). However, we cannot use formula (4.8.5) if $mr$ is large, whereas expression (4.8.4) remains valid (provided that $r \ll 1$). For the example given above, the exact value of $1.02^{50}$ (with three decimal places) is 2.692, formula (4.8.4) yields $1.02^{50} \approx e^1 \approx 2.72$, and formula (4.8.5) yields $(1 + 0.02)^{50} \approx 1 + 50 \times 0.02 = 2$. Computations using $e$ yielded an error of about 1% whereas formula (4.8.5) yielded an error of about 25%.
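The three ways of estimating the growth factor in this example can be set side by side. The sketch below is illustrative only; the function names are assumptions introduced here.

```python
import math


def growth_exact(r: float, m: int) -> float:
    """Exact compound growth factor (1 + r)**m."""
    return (1.0 + r) ** m


def growth_exponential(r: float, m: int) -> float:
    """Approximation (4.8.4): (1 + r)**m ≈ e**(m*r), valid for small r."""
    return math.exp(m * r)


def growth_linear(r: float, m: int) -> float:
    """Approximation (4.8.5): (1 + r)**m ≈ 1 + m*r, valid only for small m*r."""
    return 1.0 + m * r


if __name__ == "__main__":
    r, m = 0.02, 50  # 2% annual growth over 50 years
    exact = growth_exact(r, m)
    for name, value in (("exact", exact),
                        ("(4.8.4)", growth_exponential(r, m)),
                        ("(4.8.5)", growth_linear(r, m))):
        print(f"{name:>8}: {value:.3f}  relative error {abs(value - exact) / exact:.1%}")
```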

In general form an estimate of the precision of formula (4.8.4) is given in Section 6.1 (see Exercise 6.1.4). It will be found there that the relative error introduced by this formula is of the order of $mr^2$, so that formula (4.8.4) can be employed when $mr^2 \ll 1$ and formula (4.8.5), as we already know, only when $mr \ll 1$. But since (alas) the condition $mr \ll 1$ for $r$ small is much stronger than $mr^2 \ll 1$, there are values of $m$ and $r$ for which the approximation (4.8.4) is valid but (4.8.5) is not. For instance, in the above-considered case of $r = 0.02$ and $m = 50$ we have $mr = 1$ and $mr^2 = 0.02$, which means that $mr^2$ is small and $mr$ is not. In the same manner, for $r = 0.001$ and $m = 10\,000$ we have $mr = 10$ and there is no way in which condition $mr \ll 1$ can be satisfied, while the condition $mr^2 \ll 1$ can be assumed valid (cf. Exercise 4.8.2).
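The order-of-magnitude claim can be checked on the two numerical cases just mentioned. The sketch below compares the actual relative error of (4.8.4) with $mr^2/2$; the factor $1/2$ is an assumption of this illustration, taken from the standard expansion $\ln(1 + r) = r - r^2/2 + \ldots$, and the function name is hypothetical.

```python
import math


def relative_error_exponential(r: float, m: int) -> float:
    """Relative error of e**(m*r) as an estimate of (1 + r)**m."""
    exact = (1.0 + r) ** m
    return (math.exp(m * r) - exact) / exact


if __name__ == "__main__":
    for r, m in ((0.02, 50), (0.001, 10_000)):
        err = relative_error_exponential(r, m)
        print(f"r = {r}, m = {m}: mr = {m * r:g}, mr^2 = {m * r * r:g}, "
              f"relative error = {err:.5f}, mr^2/2 = {m * r * r / 2:.5f}")
```

In both cases $mr$ is not small, yet the error stays near $mr^2/2$, which is the behaviour the text describes.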

In accordance with the original definition of the number $e$ by formula (4.8.1), the derivatives of exponential functions are especially simple when the base is equal to $e$. These derivatives are conveniently expressed in terms of the function itself. Here are a number of formulas:

$$y = e^x, \qquad \frac{dy}{dx} = e^x = y;$$

$$y = Ce^x, \qquad \frac{dy}{dx} = C\frac{de^x}{dx} = Ce^x = y;$$

$$y = Ce^{kx}, \qquad \frac{dy}{dx} = C\frac{de^{kx}}{dx} = Ce^{kx}\frac{d(kx)}{dx} = ky;$$

$$y = e^{m(x)}, \qquad \frac{dy}{dx} = \frac{de^{m(x)}}{dx} = e^{m(x)}\frac{dm(x)}{dx} = \frac{dm(x)}{dx}\,y;$$

$$y = f(x)\,e^{m(x)}, \qquad \frac{dy}{dx} = f'(x)\,e^{m(x)} + f(x)\,e^{m(x)}\,m'(x) = y\left(\frac{f'(x)}{f(x)} + m'(x)\right).$$
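These rules can be spot-checked numerically with a central difference quotient. The sketch below is only an illustration; the sample functions ($C = 3$, $k = -2$, $f(x) = x^2$, $m(x) = \sin x$) and all names are assumptions chosen here.

```python
import math


def central_difference(func, x: float, h: float = 1e-6) -> float:
    """Numerical derivative of func at x by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def scaled_exponential(x: float, c: float = 3.0, k: float = -2.0) -> float:
    """y = C e^{kx}; the rule above predicts dy/dx = k*y."""
    return c * math.exp(k * x)


def product_example(x: float) -> float:
    """y = f(x) e^{m(x)} with f(x) = x**2 and m(x) = sin x."""
    return x ** 2 * math.exp(math.sin(x))


def product_example_derivative(x: float) -> float:
    """Rule above: dy/dx = y * (f'(x)/f(x) + m'(x)) = y * (2/x + cos x)."""
    return product_example(x) * (2.0 / x + math.cos(x))


if __name__ == "__main__":
    x = 0.7
    print(central_difference(scaled_exponential, x), -2.0 * scaled_exponential(x))
    x = 1.3
    print(central_difference(product_example, x), product_example_derivative(x))
```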


The exponential function of $x$ to base $e$ is commonly written $e^x$. Often, however, another designation for $e^x$ is used: $y = \exp x$ (read: the exponential of $x$). The law $e^x$ is called the exponential law and the function $e^x$ is termed the exponential function (in the narrow sense). The designation with exp is especially convenient when $x$ proves to be a complicated and unwieldy expression. For example, it is more convenient to write $\exp\left(\frac{7x^2 + 24x}{x^3 + 5}\right)^3$ than

$$e^{\left(\frac{7x^2 + 24x}{x^3 + 5}\right)^3}.$$

The number $e$ can also be defined geometrically. Two exponential functions, $y = a^x$ and $y_1 = b^x$, are connected via a simple relation:

$$b^x = a^{kx}, \quad \text{where } k = \log_a b \tag{4.8.6}$$

(this fact has already been used in formula (4.7.2)). From (4.8.6) it follows that the graph of $y_1$ is obtained from the graph of $y$ by shrinking the latter $k$-fold along the $x$ axis (Figure 4.8.1a; cf. Section 1.7).

It is clear that all graphs of the function $y = a^x$ corresponding to different values of $a$ pass through one point, (0, 1) (since $a^0 = 1$ for arbitrary $a$); however, the graphs intersect the $y$ axis at this point at different angles.

Figure 4.8.1

The simplest exponential function is the one whose curve intersects the $y$ axis at (0, 1) at an angle of 45° (strictly speaking, the tangent to the curve at this point intersects the $x$ axis at an angle of 45°) (Figure 4.8.1b). From (4.8.1) it follows that this curve is the one corresponding to the function $e^x$ (recall the geometric meaning of the derivative).

Conclusion. To summarize, we can give four distinct definitions of the number $e$: (1) from the condition $(e^x)' = e^x$, (2) from the condition $e^r \approx 1 + r$ for $r \ll 1$, (3) as the limit of $(1 + 1/n)^n$ as $n \to \infty$, and (4) as the base in the exponential function $y = e^x$ (the graph of this function intersects the $y$ axis at an angle of 45°).

The number e has other remarkable 
definitions and properties; some of 
these will be discussed in Chapter 6. 

To fix this important material in his 
mind, the reader is advised to 
put aside the book and see how from 
each definition there follow the other 
three regarded as properties of number 
e. 

Note, finally, that in the equation $y = e^x$ the quantities $x$ and $y$ are, of course, dimensionless numbers, since $e^x$ with $x = 100$ m or $x = 5$ h has no meaning. Consequently, in the laws of science that involve the exponential law connecting two dimensional quantities, $x$ and $y$, we always have $y = l_2 e^{x/l_1}$, where $l_1$ and $l_2$ are the units of measurement of $x$ and $y$. The transition to a new unit for $x$, or $l_1' = c\,l_1$, always leads to a transition from the old function $y = e^x$ to a new function $y = e^{cx'}$ (equivalent to the old one), where $x' = x/c$ is the same quantity $x$ but measured in new units $l_1'$. Similarly, the substitution $l_2' = l_2/d$ for the unit for $y$ leads to a transition from formula $y = e^x$ to $y' = de^x$ (where $y'$ is measured in units of $l_2'$). Correspondingly, all functions of the type $y = ae^{kx}$, where $a$ and $k$ may be arbitrary, are called exponential functions, since in the majority of cases a transition to new $a$ and $k$ means only a change in the units of measurement, so that only the fact that the two quantities ($x$ and $y$) are connected by an exponential law and the signs of $a$ and $k$ ($k$ positive corresponds to an increasing function $y = e^{kx}$ and $k$ negative to a decreasing function $y = e^{-|k|x}$) have physical meaning.

Exercises 

4.8.1. Find the derivatives of the following functions: (a) $y = e^{-x}$, (b) $y = 5e^x - e^{3x}$,

(c) y = e , (d) y=e v x , and (e) y=e 

4.8.2. The interest rate on a sum on a 
bank account is 0.1% every month. How much 
will the sum in the bank account increase in a 
thousand years? [Hint. Use formula (4.8.4).] 

4.9 Logarithms 

By definition, the logarithm of a quantity $h$ to a base $a$ is the exponent $f$ of the power to which $a$ (the logarithmic base) must be raised in order to obtain the given number $h$:

$$f = \log_a h \text{ means that } h = a^f. \tag{4.9.1}$$

In this sense the logarithmic function $y = \log_a x$ is the inverse (see Section 1.6) of the exponential function $y = a^x$.

The curve representing the relationship $y = \log_a x$ (with $a > 1$) is depicted in Figure 4.9.1. Note that at $x = 1$ (irrespective of the value of $a$) we have $y = 0$, for $x > 1$ we have $y > 0$, and for $x < 1$ we have $y < 0$. The entire curve is located to the right of the axis of ordinates: since a positive number $a$ raised to any power yields a positive number, there are no logarithms of negative numbers.^{4.20}

Figure 4.9.1

As seen from Figure 4.9.1, the derivative of the function $y = \log_a x$ (for $a > 1$) is positive for all values of $x$, since $y$ increases with $x$ (for $a > 1$). Note also that the rate of growth of $y = \log_a x$ with $x$ decreases, that is, the derivative decreases as $x$ increases (below we will give a rigorous proof of this).

Let us derive a formula connecting the logarithms of one and the same number to different bases. Suppose that

$$f = \log_a h, \quad a^f = h. \tag{4.9.2}$$

Taking the logarithms of both sides of (4.9.2) to base $b$, we get

$$f \log_b a = \log_b h, \quad \text{whence } f = \frac{\log_b h}{\log_b a}.$$

Taking (4.9.2) into account, we get

$$\log_a h = \frac{\log_b h}{\log_b a}. \tag{4.9.3}$$

This can be rewritten as

$$\log_b x = k \log_a x, \quad \text{where } k = \log_b a. \tag{4.9.3a}$$

The number $k = \log_b a$ is known as the modulus of logarithms to base $b$ with respect to logarithms to base $a$. From (4.9.3a) it follows that the graph for $y = \log_b x$ is obtained from the graph for $y = \log_a x$ by a $k$-fold stretching along the $y$ axis.
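Formulas (4.9.3) and (4.9.3a) translate directly into a short computation. The following is a minimal sketch; the helper names `log_base` and `modulus` are assumptions made for this illustration, and natural logarithms from Python's `math` module play the role of the intermediate base.

```python
import math


def log_base(x: float, base: float) -> float:
    """Logarithm of x to an arbitrary base, computed by the change-of-base formula (4.9.3)."""
    return math.log(x) / math.log(base)


def modulus(b: float, a: float) -> float:
    """Modulus k = log_b(a) that converts base-a logarithms into base-b logarithms."""
    return log_base(a, b)


if __name__ == "__main__":
    a, b, x = 10.0, math.e, 7.0
    k = modulus(b, a)  # here k = ln 10
    # (4.9.3a): log_b x = k * log_a x
    print(f"k = {k:.6f}, log_b x = {log_base(x, b):.6f}, k * log_a x = {k * log_base(x, a):.6f}")
```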


^{4.20} See, however, Section 14.3.


Logarithms to base $e$ are called natural logarithms and are denoted by $\ln x$. The “naturalness” of this system of logarithms is due to the fact that in some respects this system appears to be the simplest; for instance, the rate of growth (the derivative) is the simplest in the case of logarithms if $y = \ln x$. These remarkable properties of natural logarithms, which we will repeatedly encounter below, led to a situation in which the two creators of the theory of logarithms, the Scottish amateur mathematician John Napier (1550-1617) and the Swiss watch and instrument maker Joost Burgi (1552-1632), independently and almost simultaneously discovered precisely this type of logarithms (or a very similar system of logarithms). The base-10 logarithms are known as common logarithms and were first considered by Henry Briggs (1561-1630), an English astronomer, geometer, and numerical-table maker, prompted by Napier, who was his friend and whom he greatly admired.^{4.21}

Natural logarithms can be characterized by the property that the graph of $y = \ln x$ intersects the axis of abscissas at point (1, 0) at an angle of 45°, since the curve is obtained from that of the inverse, $y = e^x$, by reflection with respect to the bisector of the coordinate angle (see Section 1.6), and the curve $y = e^x$ intersects the axis of ordinates at point (0, 1) at an angle of 45° (see Figure 4.8.1b).

^{4.21} In the beginning, common logarithms were known as Briggs logarithms and natural logarithms as Napierian logarithms.

Let us now find the derivative of a natural logarithm. We consider $d \ln x = \ln(x + dx) - \ln x$. Take advantage of the familiar formula $\ln a - \ln b = \ln(a/b)$. Then

$$d \ln x = \ln\frac{x + dx}{x} = \ln\left(1 + \frac{dx}{x}\right). \tag{4.9.4}$$

We already know (see formula (4.8.2)) that $e^r \approx 1 + r$ for $r$ small. Take the logarithms of both sides; since $\ln e^r = r$, we arrive at an extremely important formula:

$$\ln(1 + r) \approx r \tag{4.9.5}$$

(the approximate equation has the same meaning as (4.8.2): it is valid for $r \ll 1$). Combining (4.9.4) and (4.9.5), we get

$$d \ln x = \ln\left(1 + \frac{dx}{x}\right) = \frac{dx}{x}$$

(employing $dx$ instead of $\Delta x$ makes it possible to assume that the last equation is exact), whence

$$\frac{d \ln x}{dx} = \frac{1}{x}. \tag{4.9.6}$$

When $x$ varies in geometric progression, $\ln x$ varies in arithmetic progression, that is, if $x = a, ab, ab^2, ab^3, \ldots$, then $\ln x = \ln a, \ln a + c, \ln a + 2c, \ln a + 3c, \ldots$, where $c = \ln b$.

For this reason, the larger $x$ is, the slower $\ln x$ grows and the smaller is the derivative, which is reflected in formula (4.9.6).

The derivative of a natural logarithm can also be found by using the fact that the logarithmic function and the exponential function are inverse functions (with respect to each other). We can write

$$y = \ln x, \quad x = e^y, \quad x'(y) = \frac{d(e^y)}{dy} = e^y, \quad \frac{dy}{dx} = \frac{1}{x'(y)} = \frac{1}{e^y} = \frac{1}{x}.$$

Now, using (4.9.3a), we can calculate the derivative of a logarithm to any base $a$. Suppose that $y = \log_a x$. Then (see (4.9.3) and (4.9.3a))

$$y = \frac{\ln x}{\ln a}, \quad \frac{dy}{dx} = \frac{1}{\ln a}\frac{d \ln x}{dx} = \frac{1}{\ln a}\cdot\frac{1}{x}. \tag{4.9.7}$$

Replacing $a$ with $e$ and $b$ and $h$ with $a$ in (4.9.3), we get $\ln a = (\log_a e)^{-1}$; we can rewrite (4.9.7) in the following manner:

$$\frac{d \log_a x}{dx} = \frac{\log_a e}{x}. \tag{4.9.7a}$$

The simplest one of the formulas (4.9.6), (4.9.7), and (4.9.7a) is (4.9.6), which is valid for logarithms to base $e$. For rough mental calculations, it is advisable to memorize the following facts: $\ln 2 \approx 0.69$, $\ln 3 \approx 1.1$, and $\ln 10 \approx 2.3 \approx 1/0.434$. A short table of natural logarithms is given in Table A4.2 at the end of the book.
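Formulas (4.9.5), (4.9.6) and (4.9.7a) can be verified with a few lines of code. This is a sketch only; the function names and the test point $x = 2.5$ are assumptions chosen for illustration.

```python
import math


def central_difference(func, x: float, h: float = 1e-6) -> float:
    """Numerical derivative of func at x by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def log_derivative(x: float, base: float = math.e) -> float:
    """Formula (4.9.7a): d(log_a x)/dx = log_a(e) / x; for base e this is 1/x."""
    log_a_of_e = 1.0 / math.log(base)
    return log_a_of_e / x


if __name__ == "__main__":
    x = 2.5
    print("d ln x/dx    :", central_difference(math.log, x), "vs", log_derivative(x))
    print("d log10 x/dx :", central_difference(math.log10, x), "vs", log_derivative(x, 10.0))
    # (4.9.5): ln(1 + r) is close to r for small r.
    for r in (0.1, 0.01, 0.001):
        print(f"r = {r}: ln(1 + r) = {math.log(1.0 + r):.6f}")
```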

If some function $f(x)$ is under the sign of the logarithm, the derivative is found by the rule for differentiating composite functions (see Section 4.3):

$$\frac{d \ln f(x)}{dx} = \frac{1}{f(x)}\frac{df(x)}{dx}. \tag{4.9.8}$$

The derivative $(\ln f(x))'$ of the logarithm of a function $f(x)$ is often called the logarithmic derivative of $f(x)$; according to (4.9.8), it is equal to the ratio of the derivative $f'(x)$ to the function itself, $f(x)$. Here is an example that illustrates the usefulness of this concept.

Formula (4.9.8) enables one to find the derivatives of expressions of the form $f(x)^{h(x)}$, that is, such that contain the variable both in the base and in the exponent. Suppose that

$$y = f(x)^{h(x)}. \tag{4.9.9}$$

Taking logs (the logarithms can be taken to any base; we choose natural logarithms), we obtain

$$\ln y = h(x) \ln f(x). \tag{4.9.10}$$

Let us now take the derivatives of both sides in (4.9.10) and have regard for the fact that $\ln y$ is a composite function of $x$ (just as $\ln f(x)$ is):

$$\frac{1}{y}\,y' = h'(x) \ln f(x) + h(x)\frac{f'(x)}{f(x)}, \qquad y' = y\left[h'(x) \ln f(x) + h(x)\frac{f'(x)}{f(x)}\right],$$

or, allowing for (4.9.9),

$$y' = f(x)^{h(x)}\,h'(x) \ln f(x) + h(x)\,f(x)^{h(x)-1} f'(x). \tag{4.9.11}$$

Consider this formula. On the right we have a sum of two terms: the first term, $f(x)^{h(x)} h'(x) \ln f(x)$, is the derivative of $f^h$ computed on the assumption that only $h$ is a variable while $f$ is held constant, and the second term, $h(x) f(x)^{h(x)-1} f'(x)$, is the derivative of $f^h$ computed on the assumption that only $f$ is a variable while $h$ is held constant. This confirms the general principle expressed in Section 4.2.
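Formula (4.9.11) can be tested on a concrete pair of functions. In the sketch below the choice $f(x) = 1 + x^2$, $h(x) = \sin x$ and every function name are assumptions made purely for illustration.

```python
import math


def central_difference(func, x: float, h: float = 1e-6) -> float:
    """Numerical derivative of func at x by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def power_derivative(f, df, h, dh, x: float) -> float:
    """Formula (4.9.11): derivative of f(x)**h(x) as a sum of two partial contributions."""
    exponent_varies = f(x) ** h(x) * dh(x) * math.log(f(x))   # f held constant
    base_varies = h(x) * f(x) ** (h(x) - 1.0) * df(x)         # h held constant
    return exponent_varies + base_varies


def sample_power(x: float) -> float:
    """Sample function y = (1 + x**2)**sin(x)."""
    return (1.0 + x * x) ** math.sin(x)


if __name__ == "__main__":
    x = 0.8
    by_formula = power_derivative(lambda t: 1.0 + t * t, lambda t: 2.0 * t,
                                  math.sin, math.cos, x)
    print(by_formula, central_difference(sample_power, x))
```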

Note, finally, that in $y = \log_a x$ (just as in $y = a^x$), the variables $x$ and $y$ and the constant $a$ are dimensionless. If $x$ is a dimensional quantity (say, length), then the number $x$ is specified only to within a constant factor depending on the chosen unit for $x$. Correspondingly, the quantity $\log_a x$ (for any fixed base $a$) is determined only to within a constant term (to within an additive constant, as mathematicians usually say in such cases); in that respect it is similar to the indefinite integral.^{4.22} If both $x$ and $y$ are dimensional quantities, then the right way to approach this problem is to write $y = l_2 \log_a(x/l_1)$, where the factors $l_1$ and $l_2$ are units of measurement for $x$ and $y$, since a logarithm can be taken only of a dimensionless quantity, say of number 100 obtained as a result of taking the fraction, say (distance $AB$)/(unit $l_1$), where $l_1$ could be one meter. The value of a logarithmic function is also a dimensionless quantity, whereby the dimensional quantity $y$ must be equal to $l_2 \log_a(x/l_1)$, that is, $\log_a(x/l_1)$ units $l_2$. These simple considerations must be allowed for since the logarithmic function (just as the exponential function) is often encountered in the laws of science.

It must also be noted that, in view of (4.9.3) and (4.9.3a), the transition from one base $a$ to another base $b$ is reduced to multiplying all logarithms to the old base $a$ by a constant, $k = \log_b a$, that is, to a change in the unit for $y = \log_a x$. This is why, in the majority of cases, the base of a logarithm does not play an important role, which makes it possible almost always to employ natural logarithms as being the simplest.

^{4.22} As for the deep connection between logarithms and integrals (which partially explains this analogy), see formula (5.2.2) (which sometimes is used to define logarithms) and the subsequent discussion in Section 7.7.

Exercises 

4.9.1. What is the value of $\log_3 15$? [Hint. Use formula (4.9.3).]

4.9.2. Derive the formula for the derivative of (a) the product of two functions, using the formula $\log(uv) = \log u + \log v$, and (b) the quotient of two functions, using the formula $\log(u/v) = \log u - \log v$.

4.9.3. Find the derivative of each of the following functions: (a) $y = \ln(x + 3)$, (b) $y = \ln 2x$, (c) $y = \ln(x^2 + 1)$, (d) $y = \ln(x + 1/x)$,

(e) y = ln(3*»-*+l), (f) y = In , 

Vx 

<g )y =ln 3 — _ =T, (h) y = x\nx, (i) y = 

V *+i 

x 3 In ( x 1), (j) y — x x , and (k) y = 

y'xV**- 1 . 

4.10 Trigonometric Functions 

In this section we will find the deriva- 
tives of trigonometric functions. 

Trigonometric functions are defined 
as ratios of line segments, ratios of the 
sides of right triangles or ratios of cer- 
tain line segments in a circle (the line 
of sines, the line of cosines, etc.) to 
the circle’s radius. For this reason trig- 
onometric functions are dimensionless, 
and the independent variable in such 
functions is also dimensionless, an an- 
gle. The natural measure of angles in 
higher mathematics is the radian , a 
central angle subtended in a circle by an 
arc whose length is equal to the radius 
of the circle. A central angle of t radians 
is subtended in a circle of radius 1 by 
an arc whose length is t . The importance 
of trigonometric functions to the de- 
scription of physical processes will be 
discussed in Part 2 (see Chapter 10). 

In what follows we will always consider a circle of unit radius. Then we will briefly say that the sine is equal to the length of the line of sines in that circle, the angle is equal to the arc length, and so on. The reader must bear in mind, however, that both trigonometric functions and angles are dimensionless quantities and are not measured by any units of length (centimeters, inches, or meters). The sine is equal to the length of the line of sines (in centimeters) divided by the length of the radius (in centimeters), and only when $r = 1$ cm is it numerically equal to the length of the line of sines. The lines of sines and cosines are shown in Figure 4.10.1.

Figure 4.10.1

Recall the form of the graphs of sine and cosine as functions of the angle (Figure 4.10.2). The period of the sine, like that of the cosine, is equal to $2\pi \approx 6.28$ and corresponds to a complete revolution of the radius of the circle.

Let us find the derivatives of the sine and cosine geometrically. In Figure 4.10.3, the endpoint of the radius drawn at an angle $\varphi$ is $A$, and the endpoint of the radius drawn at a close angle $\varphi + d\varphi$ is $B$. The length of arc $AB$ is therefore equal to $d\varphi$. Draw from $A$ a perpendicular $AC$ to the line of sines $BB'$ of the angle $\varphi + d\varphi$. As can be seen from Figure 4.10.3,

$$AA' = \sin\varphi, \quad BB' = \sin(\varphi + d\varphi), \quad BC = \sin(\varphi + d\varphi) - \sin\varphi = d\sin\varphi$$

(here, of course, the last equation is “exact in the limit”, as $d\varphi \to 0$). Moreover,

$$OA' = \cos\varphi, \quad OB' = \cos(\varphi + d\varphi), \quad A'B' = AC = \cos\varphi - \cos(\varphi + d\varphi) = -d\cos\varphi.$$

Since the angle $d\varphi$ is small, the arc length $AB$ does not differ much from the length of the chord $AB$ and the angle $ABC$ formed by the chord $AB$ and the vertical line $BCB'$ is equal to $\varphi$.^{4.23} From a consideration of triangle $ABC$ we find that $BC = AB\cos\varphi$ and $AC = AB\sin\varphi$. Thus,

$$d\sin\varphi = \cos\varphi\,d\varphi, \qquad -d\cos\varphi = \sin\varphi\,d\varphi$$

and, hence,

$$\frac{d\sin\varphi}{d\varphi} = \cos\varphi, \qquad \frac{d\cos\varphi}{d\varphi} = -\sin\varphi. \tag{4.10.1}$$
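The geometric argument can be imitated numerically by placing the points $A$ and $B$ on the unit circle and measuring the legs $BC$ and $AC$ of the small triangle. This is a rough sketch; the function name and the sample values of the angle and its increment are assumptions.

```python
import math


def chord_projections(phi: float, dphi: float) -> tuple[float, float]:
    """Return the legs (BC, AC) of triangle ABC for points A, B on the unit circle.

    A lies at angle phi and B at angle phi + dphi.
    BC is the increment of the sine, AC is minus the increment of the cosine.
    """
    ax, ay = math.cos(phi), math.sin(phi)
    bx, by = math.cos(phi + dphi), math.sin(phi + dphi)
    return by - ay, ax - bx


if __name__ == "__main__":
    phi, dphi = 0.7, 1e-4
    bc, ac = chord_projections(phi, dphi)
    # Dividing by the arc length dphi should give cos(phi) and sin(phi).
    print(f"BC/dphi = {bc / dphi:.6f}, cos(phi) = {math.cos(phi):.6f}")
    print(f"AC/dphi = {ac / dphi:.6f}, sin(phi) = {math.sin(phi):.6f}")
```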


^{4.23} The tangent line $t$ to the circle at point $A$ forms with the vertical line $AA'$ an angle $A'At$ equal to $\varphi$ ($\angle A'At = \angle AOA'$ as angles with mutually perpendicular sides); $\triangle ABC$ differs very little from $\triangle A'At$ since chord $AB$ has practically the same direction as tangent $t$.

Such a simple and pictorial way of calculating the derivative of the sine or cosine is quite convincing for a beginner. On the other hand, an experienced and knowledgeable reader (not the one for whom this book is intended) will notice the difficulties inherent in such an approach. For a small but finite (i.e. not infinitesimal) angle $\Delta\varphi$ the length of chord $AB$ is less than the arc length: it can be proved that

$$AB = \Delta\varphi - \frac{(\Delta\varphi)^3}{24} + \ldots$$

Furthermore, actually the angle $CBA$ is $\varphi + \Delta\varphi/2$ instead of $\varphi$, so that

$$BC = AB\cos\left(\varphi + \frac{\Delta\varphi}{2}\right) = \left(\Delta\varphi - \frac{(\Delta\varphi)^3}{24} + \ldots\right)\left(\cos\varphi - \frac{\Delta\varphi}{2}\sin\varphi + \ldots\right).$$

Therefore, strictly speaking, the above reasoning is not rigorous and the relationships are incorrect. However, they are true to within infinitesimals of first order with respect to the small quantity $\Delta\varphi$. Hence, while for increments we have only the approximate equations $\Delta\sin\varphi \approx \Delta\varphi \times \cos\varphi$ and $\Delta\cos\varphi \approx -\Delta\varphi \times \sin\varphi$, the corresponding equations for differentials are exact, which means that Eqs. (4.10.1) are exact, too.

The derivation of the formulas for the derivatives of trigonometric functions we have just described is equivalent to the derivation based on mechanical analogies (for one, a derivation that incorporates the idea of a vector). We imagine a point $M = M(t)$ whose movement in the plane is described by two parametric equations:

$$x = \varphi(t), \quad y = \psi(t), \tag{4.10.2}$$

where $t$ is the time variable (see Section 1.8). In this case the rate of variation in the coordinates $x$ and $y$ of point $M$ will be given by the derivatives $x' = dx/dt$ and $y' = dy/dt$. The vector $MM_1/\Delta t = \mathbf{v}_{av} = (\Delta x/\Delta t, \Delta y/\Delta t)$, where $\Delta x/\Delta t$ and $\Delta y/\Delta t$ are the coordinates of the vector (see Figure 4.10.4), characterizes the average velocity over the line segment $MM_1$ of the path (here $M_1 = M(t + \Delta t)$). Vector $\mathbf{v} = (dx/dt, dy/dt)$ characterizes the instantaneous velocity at point $M$ of the path: it is directed along the tangent to the trajectory at point $M$ and its length is equal to $\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \frac{ds}{dt}$, where $\Delta s \approx |MM_1|$ is the increment of the distance $s$ over time $\Delta t$.

Now suppose the point $M$ moves uniformly in a circle of radius 1; it travels, say, 1 cm in 1 s. Then Eqs. (4.10.2) assume the form

$$x = \cos t, \quad y = \sin t, \tag{4.10.3}$$

and the velocity $\mathbf{v}$ of the motion is given by a vector with coordinates $d(\cos t)/dt$, $d(\sin t)/dt$. Clearly, in this case vector $\mathbf{v}$ will be of unit length (since by hypothesis the increment $\Delta s$ of distance is equal to the increment $\Delta t$ of time) and directed along the tangent to the circle (at every point; see Figure 4.10.4). The coordinates of vector $\mathbf{v}$, which is perpendicular to the radius $OM = r$ of the circle, are obviously $-\sin t$ and $\cos t$ (compare the right triangles $AOM$ and $VMP$ whose sides are mutually perpendicular in Figure 4.10.4; they have the same hypotenuses $OM = VM = 1$ and equal acute angles $\angle AOM = \angle MVP = t$; the minus sign in front of $\sin t$ here specifies the direction of the line segment $MP$). Thus, we again arrive at formula (4.10.1):

Figure 4.10.4

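$$\frac{d\cos t}{dt} = -\sin t, \qquad \frac{d\sin t}{dt} = \cos t.$$

Finally, we will give another method for calculating the derivatives of $\sin\varphi$ and $\cos\varphi$, which does not require a drawing and is more formal. According to the general formulas, $\Delta\sin\varphi = \sin(\varphi + \Delta\varphi) - \sin\varphi$. Recall the formula for the sine of the sum of two angles,

$$\sin(\alpha + \beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta,$$

and apply it to $\sin(\varphi + \Delta\varphi)$ to get

$$\sin(\varphi + \Delta\varphi) = \sin\varphi\cos\Delta\varphi + \cos\varphi\sin\Delta\varphi,$$

whence

$$\Delta\sin\varphi = \sin\varphi\cos\Delta\varphi + \cos\varphi\sin\Delta\varphi - \sin\varphi = \cos\varphi\sin\Delta\varphi - \sin\varphi\,(1 - \cos\Delta\varphi).$$

Let us form the ratio of the increments as follows:

$$\frac{\Delta\sin\varphi}{\Delta\varphi} = \cos\varphi\,\frac{\sin\Delta\varphi}{\Delta\varphi} - \sin\varphi\,\frac{1 - \cos\Delta\varphi}{\Delta\varphi}. \tag{4.10.4}$$

Now we have to send $\Delta\varphi$ to zero and thus pass to the limit. We know that for angles $\alpha$ or $\Delta\varphi$ tending to zero, the sine is equal to the arc length, $\sin\alpha \approx \alpha$, or $\sin\Delta\varphi \approx \Delta\varphi$ as $\Delta\varphi \to 0$. In other words,

$$\lim_{\Delta\varphi \to 0} \frac{\sin\Delta\varphi}{\Delta\varphi} = 1.$$

The second term on the right-hand side of (4.10.4) must first be transformed by using the familiar formula $1 - \cos 2\alpha = 2\sin^2\alpha$, whereby $1 - \cos\Delta\varphi = 2\sin^2(\Delta\varphi/2)$. In this formula, we substitute $\Delta\varphi/2$ for $\sin(\Delta\varphi/2)$ (this is justified since $\Delta\varphi$ is small). Hence,

$$\frac{1 - \cos\Delta\varphi}{\Delta\varphi} \approx \frac{2(\Delta\varphi/2)^2}{\Delta\varphi} = \frac{\Delta\varphi}{2}.$$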


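Hence, in the limit, as $\Delta\varphi \to 0$, the second term vanishes: $\lim_{\Delta\varphi \to 0} \frac{1 - \cos\Delta\varphi}{\Delta\varphi} = 0$, whence $\lim_{\Delta\varphi \to 0} \frac{\Delta\sin\varphi}{\Delta\varphi} = \frac{d\sin\varphi}{d\varphi} = \cos\varphi$. Similarly,

$$\Delta\cos\varphi = \cos(\varphi + \Delta\varphi) - \cos\varphi = \cos\varphi\cos\Delta\varphi - \sin\varphi\sin\Delta\varphi - \cos\varphi = -\cos\varphi\,(1 - \cos\Delta\varphi) - \sin\varphi\sin\Delta\varphi$$

(here we have used the formula for the cosine of the sum of two angles), and therefore

$$\frac{\Delta\cos\varphi}{\Delta\varphi} = -\cos\varphi\,\frac{1 - \cos\Delta\varphi}{\Delta\varphi} - \sin\varphi\,\frac{\sin\Delta\varphi}{\Delta\varphi}.$$

Since we already know that

$$\lim_{\Delta\varphi \to 0} \frac{1 - \cos\Delta\varphi}{\Delta\varphi} = 0, \qquad \lim_{\Delta\varphi \to 0} \frac{\sin\Delta\varphi}{\Delta\varphi} = 1,$$

the final result is

$$\frac{d\cos\varphi}{d\varphi} = \lim_{\Delta\varphi \to 0} \frac{\Delta\cos\varphi}{\Delta\varphi} = -\cos\varphi \times 0 - \sin\varphi \times 1 = -\sin\varphi.$$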
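The two limits used in this formal derivation can be observed numerically, together with the decomposition (4.10.4) of the difference quotient of the sine. The sketch below is illustrative; all function names are assumptions.

```python
import math


def sine_ratio(dphi: float) -> float:
    """sin(dphi)/dphi, which tends to 1 as dphi -> 0."""
    return math.sin(dphi) / dphi


def cosine_ratio(dphi: float) -> float:
    """(1 - cos(dphi))/dphi, which behaves like dphi/2 and tends to 0."""
    return (1.0 - math.cos(dphi)) / dphi


def sine_quotient(phi: float, dphi: float) -> float:
    """Right-hand side of (4.10.4) for the difference quotient of the sine."""
    return math.cos(phi) * sine_ratio(dphi) - math.sin(phi) * cosine_ratio(dphi)


if __name__ == "__main__":
    phi = 1.0
    for dphi in (0.5, 0.1, 0.01, 0.001):
        print(f"dphi = {dphi}: sin/dphi = {sine_ratio(dphi):.6f}, "
              f"(1 - cos)/dphi = {cosine_ratio(dphi):.6f}, "
              f"quotient = {sine_quotient(phi, dphi):.6f}, cos(phi) = {math.cos(phi):.6f}")
```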

Equations (4.10.1) are valid for ar- 
bitrary angles and not only for those in 
the first quadrant. It is also useful to 
verify, glancing at the graphs of the 
functions sin x and cos x, that Eqs. 
(4.10.1) give the proper signs of the de- 
rivatives for any x. 

Let us check the validity of Eqs. (4.10.1) for small angles. For small $\varphi$, it is obvious geometrically that $\sin\varphi \approx \varphi$ and $\cos\varphi \approx 1$, where the approximate equality sign $\approx$, just as in many other cases, means equality to within terms of the first order of smallness. The first equation in (4.10.1) for small $\varphi$ yields $d(\sin\varphi)/d\varphi \approx 1$ and the second yields $d(\cos\varphi)/d\varphi \approx -\varphi$, which means that $d(\cos\varphi)/d\varphi = 0$ at $\varphi = 0$. The fact that the derivative is zero signifies that the cosine has a maximum at $\varphi = 0$, while the fact that $d(\sin\varphi)/d\varphi = 1$ at $\varphi = 0$ signifies that for small $\varphi$ the sine increases approximately like $\varphi$.

If we know the derivatives of the functions $y = \sin x$ and $y = \cos x$, it is easy to find the derivatives of all other trigonometric functions by using the formulas interrelating them. For instance, $\tan x = \frac{\sin x}{\cos x}$. By the formula for finding the derivative of a quotient we get

$$\frac{d\tan x}{dx} = \frac{d(\sin x/\cos x)}{dx} = \frac{\cos x\cos x - \sin x\,(-\sin x)}{\cos^2 x},$$

whence

$$\frac{d\tan x}{dx} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x}. \tag{4.10.5}$$

By a completely analogous device we find that

$$\frac{d\cot x}{dx} = -\frac{1}{\sin^2 x}. \tag{4.10.6}$$


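The derivatives of a tangent and a cotangent can also be found directly. Note that

$$\tan\alpha - \tan\beta = \frac{\sin\alpha}{\cos\alpha} - \frac{\sin\beta}{\cos\beta} = \frac{\sin\alpha\cos\beta - \sin\beta\cos\alpha}{\cos\alpha\cos\beta} = \frac{\sin(\alpha - \beta)}{\cos\alpha\cos\beta},$$

whence

$$\Delta\tan\varphi = \tan(\varphi + \Delta\varphi) - \tan\varphi = \frac{\sin\Delta\varphi}{\cos(\varphi + \Delta\varphi)\cos\varphi}. \tag{4.10.7}$$

Bearing in mind that $\lim_{\Delta\varphi \to 0} \frac{\sin\Delta\varphi}{\Delta\varphi} = 1$, we get, from (4.10.7),

$$\frac{d\tan\varphi}{d\varphi} = \lim_{\Delta\varphi \to 0} \frac{\Delta\tan\varphi}{\Delta\varphi} = \lim_{\Delta\varphi \to 0} \frac{\sin\Delta\varphi}{\Delta\varphi}\cdot\frac{1}{\cos(\varphi + \Delta\varphi)\cos\varphi} = 1 \times \frac{1}{\cos^2\varphi} = \frac{1}{\cos^2\varphi}.$$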
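Formulas (4.10.5) and (4.10.6) lend themselves to a quick numerical check. The following sketch uses a central difference quotient; the function names and sample points are assumptions chosen for illustration.

```python
import math


def central_difference(func, x: float, h: float = 1e-6) -> float:
    """Numerical derivative of func at x by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def tan_derivative(x: float) -> float:
    """Formula (4.10.5): d(tan x)/dx = 1/cos^2 x."""
    return 1.0 / math.cos(x) ** 2


def cot(x: float) -> float:
    """Cotangent, cot x = cos x / sin x."""
    return math.cos(x) / math.sin(x)


def cot_derivative(x: float) -> float:
    """Formula (4.10.6): d(cot x)/dx = -1/sin^2 x."""
    return -1.0 / math.sin(x) ** 2


if __name__ == "__main__":
    x = 0.6
    print(central_difference(math.tan, x), tan_derivative(x))
    print(central_difference(cot, x), cot_derivative(x))
    # The derivative of the tangent is positive everywhere and grows without bound near pi/2.
    for x in (1.0, 1.5, 1.57):
        print(f"x = {x}: 1/cos^2 x = {tan_derivative(x):.1f}")
```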


From Figure 4.10.5 (the graph of $\tan x$) we can see that the function $y = \tan x$ has a positive derivative for arbitrary $x$. Near the points of discontinuity ($x = \pi/2$, $x = 3\pi/2, \ldots$) the derivative increases without bound, and both the function and its derivative become infinite at these points. Both of these conclusions are in full agreement with formula (4.10.5).

Exercises



Find the derivatives of the following functions:

4.10.1. $y = \sin(2x + 3)$. 4.10.2. $y = \cos(x - 1)$. 4.10.3. $y = \cos(x^2 - x + 1)$. 4.10.4. $y = \sin^2 x$. 4.10.5. $y = \sin 3x \cos^2 x$. 4.10.6. $y = (\sin 2x)^x$. 4.10.7. $y = x \tan x$. 4.10.8. $y = e^{\tan 2x}$. 4.10.9. $y = \cot(x/2)$.


4.11 Inverse Trigonometric Functions 

New and very interesting results are obtained when we consider the inverse trigonometric functions. We remind the reader of how these functions are defined. The function

$$y = \operatorname{Arcsin} x \tag{4.11.1}$$

is an angle $y$ whose sine is equal to $x$:

$$\sin y = x. \tag{4.11.2}$$

These two equations denote the same thing. Similarly, the function

$$y = \operatorname{Arctan} x \tag{4.11.3}$$

is an angle $y$ whose tangent is equal to $x$:

$$\tan y = x. \tag{4.11.4}$$

The definitions are similar for the functions $y = \operatorname{Arccos} x$ (i.e. $x = \cos y$) and $y = \operatorname{Arccot} x$ (i.e. $x = \cot y$). Note that the functions $y = \operatorname{Arcsin} x$ and $y = \operatorname{Arccos} x$ are meaningful only for $-1 \le x \le 1$ (cf. (4.11.2)). The functions $y = \operatorname{Arctan} x$ and $y = \operatorname{Arccot} x$ are meaningful for all values of $x$.

Figure 4.11.1

Let us consider in more detail the function $y = \operatorname{Arcsin} x$. For instance, let $x = 1/2$. Then $y = \operatorname{Arcsin}(1/2)$. We can take $y = \pi/6$ since $\sin(\pi/6) = 1/2$; however, we can also take $y = 5\pi/6$, since $\sin(5\pi/6)$ is also equal to $1/2$. We can likewise take $y = 13\pi/6$, $y = 17\pi/6$, and so on. We see that one value of $x$ is associated with an infinitude of values of $y$. All that this means is that $\operatorname{Arcsin} x$ is multiple-valued (see p. 49) and even infinitely valued. These properties of $y = \operatorname{Arcsin} x$ are seen in the graph of Figure 4.11.1.

Take the portion of the curve representing the function $x = \sin y$ on which the function is monotonic; usually the portion of the curve $y = \operatorname{Arcsin} x$ on which this condition is satisfied is such that $-\pi/2 \le y \le \pi/2$. This part of the curve is called the principal value of the function $\operatorname{Arcsin} x$ and is denoted by $y = \arcsin x$. If we confine ourselves to a consideration of $y = \arcsin x$, then to each $x$ there corresponds only one value of $y$. Thus, the function $y = \arcsin x$ is single-valued.

Figure 4.11.2

The principal value of the arctangent function is defined in similar fashion:

$$-\frac{\pi}{2} < \arctan x < \frac{\pi}{2}$$

(since the function $x = \tan y$ is monotonic on the segment $-\pi/2 < y < \pi/2$, Figure 4.11.2).

Let us find the derivative of y = arcsin x. We take advantage of the fact that the arcsine is the inverse of the sine: y = arcsin x and x = sin y. We then have

$$x'(y) = \frac{dx}{dy} = \cos y, \qquad y'(x) = \frac{dy}{dx} = \frac{1}{x'(y)} = \frac{1}{\cos y}. \tag{4.11.5}$$

But we consider x the independent variable and not y, so dy/dx should be expressed in terms of x and not in terms of y, as in (4.11.5).

From $\sin^2 y + \cos^2 y = 1$ it follows that $\cos y = \pm\sqrt{1 - \sin^2 y}$. Since we are considering the principal value of the arcsine function, it follows that $-\pi/2 \le y \le \pi/2$, $\cos y \ge 0$, and so we take the plus sign in front of the radical sign: $\cos y = \sqrt{1 - \sin^2 y}$.

Since sin y = x by (4.11.2), it follows that $\cos y = \sqrt{1 - x^2}$. Substituting this into (4.11.5) yields

$$\frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}, \quad\text{or}\quad \frac{d \arcsin x}{dx} = \frac{1}{\sqrt{1 - x^2}}. \tag{4.11.6}$$

This formula may be used only for the principal value. If we want it to be valid for other values, we must choose the appropriate sign in front of the radical $\sqrt{1 - x^2}$ (this radical can be considered double-valued). Indeed, for one and the same value of x, the derivative has different signs on various portions of the curve: at points A and C of the graph of the function y = Arcsin x (Figure 4.11.1) the derivative is positive and at points B and D it is negative.

Let us now find the derivative $d \arctan x/dx$. If y = arctan x, then x = tan y, whence, by the foregoing, we find that

$$x'(y) = \frac{dx}{dy} = \frac{1}{\cos^2 y}, \qquad y'(x) = \frac{dy}{dx} = \frac{1}{x'(y)} = \cos^2 y. \tag{4.11.7}$$

From trigonometry we have

$$\tan^2 y + 1 = \frac{1}{\cos^2 y},$$

and therefore

$$\frac{1}{\cos^2 y} = 1 + x^2.$$

Using formula (4.11.7), we finally get

$$\frac{dy}{dx} = \frac{d \arctan x}{dx} = \frac{1}{1 + x^2}. \tag{4.11.8}$$

This formula holds true for any other 
branch of the arctangent (see Figure 
4.11.2) since any other branch is ob- 
tained from the principal one by a paral- 
lel translation, and this does not affect 
the magnitude of the derivative. 
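
Formulas (4.11.6) and (4.11.8) can be checked numerically. The sketch below is only an illustration: the helper names `central_diff` and `arcsin_branch` are assumptions introduced here, and the derivative is approximated by a symmetric difference quotient rather than computed exactly. It compares the principal value of the arcsine with the neighboring branch $\pi - \arcsin x$ (where the sign of the derivative flips) and shows that shifting the arctangent by $\pi$ leaves the derivative unchanged.

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def arcsin_branch(x, k):
    """Branch number k of Arcsin x (illustrative helper): an angle y with sin y = x.
    k = 0 is the principal value, k = 1 gives pi - arcsin x, and so on."""
    return (-1) ** k * math.asin(x) + k * math.pi

x = 0.5
root = math.sqrt(1 - x * x)

d_principal = central_diff(math.asin, x)                     # compare with +1/sqrt(1 - x^2)
d_branch_1 = central_diff(lambda u: arcsin_branch(u, 1), x)  # compare with -1/sqrt(1 - x^2)
d_arctan = central_diff(math.atan, x)                        # compare with 1/(1 + x^2)
d_arctan_shifted = central_diff(lambda u: math.atan(u) + math.pi, x)  # another branch

print("arcsin, principal value:", d_principal, 1 / root)
print("Arcsin, branch k = 1:   ", d_branch_1, -1 / root)
print("arctan, principal value:", d_arctan, 1 / (1 + x * x))
print("Arctan, shifted by pi:  ", d_arctan_shifted)
```

The branch with k = 1 passes through $y = 5\pi/6$ at x = 1/2, one of the values listed earlier in this section.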


Exercises 

Find the derivatives of the following func- 
tions: 

4.11.1. y = arccos x (this function is defined by the condition $0 \le y \le \pi$). 4.11.2. y = arccot x (here $0 < y < \pi$). 4.11.3. y = arcsin 2x. 4.11.4. y = arctan (3x + 1). 4.11.5. $y = \arctan (x^2 - x)$. 4.11.6. $y = e^{\arctan [\ldots]}$.

4.12 Differentiating Functions Dependent on a Parameter and Functions of Several Variables. Partial Derivatives

Often in physical problems we are forced to find the derivative of a function y(x) which, aside from depending on the independent variable x, depends on one or several parameters, that is, quantities that characterize the system being investigated (say, masses or linear dimensions) and that are assumed constant for the system considered but in general may have different values. Clearly, if the function y = f(x, t) depends on the independent variable x and the parameter t, its derivative y' = dy/dx will also depend on the same parameter t. For instance, if $y = \sin(\lambda x)$ ($\lambda$ is the parameter), then $y'(x) = dy/dx = \lambda\cos(\lambda x)$ and $y''(x) = d^2y/dx^2 = -\lambda^2\sin(\lambda x)$; if $y = e^{-ax^2}$ (a being the parameter), then $y'(x) = dy/dx = -2axe^{-ax^2}$ and $y''(x) = d^2y/dx^2 = 2a(2ax^2 - 1)e^{-ax^2}$ (verify this).

Note that since the function y = f(x, t) and its derivatives y' = dy/dx, $d^2y/dx^2$, etc. depend not only on x but on parameter t, which may assume different values, it would be interesting to know to what extent y and y' and y'', all taken at a fixed value of x, depend on parameter t if the latter is slightly varied. The answer lies in the value of the rate of change of y with t, that is, in $(dy/dt)_{x=\text{constant}}$. For instance, in the case of the two functions we considered above, $y_1 = \sin(\lambda x)$ and $y_2 = e^{-ax^2}$, the rates $dy_1/d\lambda$ and $dy_2/da$ are, respectively, $x\cos(\lambda x)$ and $-x^2e^{-ax^2}$ (x is fixed but $\lambda$ and a vary).

Obviously, if we are dealing with a function y = f(x, t), the notation y', y'', etc. is ambiguous since it does not allow us to know whether we are speaking of $(dy/dx)_{t=\text{constant}}$ or of an entirely different quantity, $(dy/dt)_{x=\text{constant}}$. To explain the notation used here, we turn to the general case of a function of two (independent) variables, z = F(x, y), since the above is a particular case of such a function. True, the function was denoted by y and the independent variables by x and t, but nothing prevents us from choosing other letters.
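
The difference between the two rates, one taken at a fixed parameter and the other at a fixed x, can be seen in a small numerical experiment. The sketch below is an illustration only: the helper `central_diff` and the sample values of x, $\lambda$, and a are assumptions chosen here, and each derivative is estimated by a symmetric difference quotient. It treats the two functions $y_1 = \sin(\lambda x)$ and $y_2 = e^{-ax^2}$ discussed above.

```python
import math

def central_diff(f, u, h=1e-6):
    """Symmetric difference quotient approximating f'(u)."""
    return (f(u + h) - f(u - h)) / (2 * h)

def y1(x, lam):
    return math.sin(lam * x)

def y2(x, a):
    return math.exp(-a * x * x)

# Sample point and parameter values (arbitrary, for illustration only).
x0, lam0, a0 = 0.7, 2.0, 1.5

# Rates with the parameter held constant (ordinary derivative in x).
dy1_dx = central_diff(lambda x: y1(x, lam0), x0)
dy2_dx = central_diff(lambda x: y2(x, a0), x0)

# Rates with x held constant (derivative with respect to the parameter).
dy1_dlam = central_diff(lambda lam: y1(x0, lam), lam0)
dy2_da = central_diff(lambda a: y2(x0, a), a0)

print(dy1_dx, lam0 * math.cos(lam0 * x0))
print(dy1_dlam, x0 * math.cos(lam0 * x0))
print(dy2_dx, -2 * a0 * x0 * math.exp(-a0 * x0 ** 2))
print(dy2_da, -x0 ** 2 * math.exp(-a0 * x0 ** 2))
```

Each printed pair should agree to several decimal places, the second number being the closed-form expression.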

If we assume that $y = y_0$ is constant and x varies, we will arrive at a function $z = F(x, y_0) = \varphi(x)$ of one variable x. The derivative $\varphi'(x)$ of such a function can, naturally, be written as $(dF(x, y)/dx)_{y = y_0 = \text{constant}}$. Since such notation is cumbersome and not too useful, we use $\partial F(x, y)/\partial x$, where the symbol $\partial$ (the “round dee”) is used instead of d to emphasize the fact that the derivative is taken at y constant, that is, the function F(x, y) is considered as a function of one variable x (it depends, however, also on parameter y). The derivative

$$\frac{\partial F(x, y)}{\partial x} = \left.\frac{dF(x, y)}{dx}\right|_{y=\text{constant}} = \lim_{\Delta x \to 0} \frac{F(x + \Delta x, y) - F(x, y)}{\Delta x}$$

is said to be the partial derivative of F(x, y) with respect to the independent variable x; similarly,

$$\frac{\partial F(x, y)}{\partial y} = \left.\frac{dF(x, y)}{dy}\right|_{x=\text{constant}} = \lim_{\Delta y \to 0} \frac{F(x, y + \Delta y) - F(x, y)}{\Delta y}$$

is the partial derivative of F(x, y) with respect to y. For example, if $y = y(x, \lambda) = \sin(\lambda x)$, then $\partial y/\partial x = \lambda\cos(\lambda x)$ and $\partial y/\partial\lambda = x\cos(\lambda x)$, while if $y = y(x, a) = e^{-ax^2}$, we have $\partial y/\partial x = -2axe^{-ax^2}$ and $\partial y/\partial a = -x^2e^{-ax^2}$.

Of course, in calculating the values of the partial derivatives $\partial F(x, y)/\partial x$ and $\partial F(x, y)/\partial y$ of a function z = F(x, y) we must fix both independent variables, $x = x_0$ and $y = y_0$, that is, $(\partial F(x, y)/\partial x)_{x = x_0,\, y = y_0}$ means that we have the partial derivative of F(x, y) with respect to x calculated on the assumption that $y = y_0$ and taken at the point $x = x_0$ (in other words, the derivative of $\varphi(x) = F(x, y_0)$ with respect to x, which is equal to $\lim_{\Delta x \to 0} [F(x_0 + \Delta x, y_0) - F(x_0, y_0)]/\Delta x$). Note also that if the value z of the function F(x, y) is measured in units of E and the variables x and y are measured in units of $e_1$ and $e_2$, respectively, then the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ have different dimensions, $E/e_1 \ne E/e_2$.

The partial derivatives of a function z = F(x, y) enable estimating the increment $\Delta z = F(x + \Delta x, y + \Delta y) - F(x, y)$ of the function caused by small variations in x and y, or increments $\Delta x$ and $\Delta y$ that are small. Obviously,

$$\Delta z = F(x + \Delta x, y + \Delta y) - F(x, y) = [F(x + \Delta x, y + \Delta y) - F(x, y + \Delta y)] + [F(x, y + \Delta y) - F(x, y)].$$

But the difference $\Delta_1 z = F(x + \Delta x, y + \Delta y) - F(x, y + \Delta y)$ is the increment of a function of a single variable, $F(x, y + \Delta y)|_{y + \Delta y = \text{constant}} = \varphi(x)$, due to an increase in the independent variable x by $\Delta x$; according to formula (4.1.2),

$$\Delta_1 z = \varphi(x + \Delta x) - \varphi(x) \approx \varphi'(x)\,\Delta x = \frac{\partial F(x, y + \Delta y)}{\partial x}\,\Delta x.$$

Similarly, the difference $F(x, y + \Delta y) - F(x, y) = \Delta_2 z$ is the increment of the function $\psi(y) = F(x, y)|_{x=\text{constant}}$ of a single variable y due to an increase in the independent variable y by $\Delta y$: according to (4.1.2),

$$\Delta_2 z \approx \psi'(y)\,\Delta y = \frac{\partial F(x, y)}{\partial y}\,\Delta y.$$

And since we can always assume that for $\Delta y$ small we have

$$\frac{\partial F(x, y + \Delta y)}{\partial x} \approx \frac{\partial F(x, y)}{\partial x},$$

the final expression for the increment $\Delta z = \Delta_1 z + \Delta_2 z$ of the function z = F(x, y) is

$$\Delta z \approx \frac{\partial z}{\partial x}\,\Delta x + \frac{\partial z}{\partial y}\,\Delta y. \tag{4.12.1}$$

It is clear that the accuracy of the approximation (4.12.1) is the higher, the smaller the increments $\Delta x$ and $\Delta y$ of the independent variables x and y of function z. In other words, for small $\Delta x$ and $\Delta y$ (which here are assumed to be of the same order of smallness), the difference $\Delta z - \left(\frac{\partial z}{\partial x}\Delta x + \frac{\partial z}{\partial y}\Delta y\right)$ will be of a higher order of smallness, and, as $\Delta x$ and $\Delta y$ tend to zero, the ratio of the left- to right-hand sides of (4.12.1) (where it is assumed, of course, that the right-hand side does not vanish) tends to unity. For instance, if $z = xy^3$, then, obviously, $\partial z/\partial x = y^3$ and $\partial z/\partial y = 3xy^2$, and small increments $\Delta x$ and $\Delta y$ lead to an increase in z by approximately $y^3\,\Delta x + 3xy^2\,\Delta y$ $\left(= \frac{\partial z}{\partial x}\Delta x + \frac{\partial z}{\partial y}\Delta y\right)$.
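
The statement about orders of smallness can be illustrated on the function $z = xy^3$. The following sketch rests on choices made here for illustration: the base point (2, 1), equal increments $\Delta x = \Delta y = h$, and the helper name `linear_estimate`. It prints, for decreasing h, the ratio of the exact increment to the linear estimate and the error divided by $h^2$; under these assumptions the first should approach unity and the second should stay roughly constant.

```python
def z(x, y):
    return x * y ** 3

def linear_estimate(x, y, dx, dy):
    """Right-hand side of (4.12.1) for z = x*y**3."""
    return y ** 3 * dx + 3 * x * y ** 2 * dy

x0, y0 = 2.0, 1.0  # arbitrary base point, for illustration only
rows = []
for h in (1e-1, 1e-2, 1e-3):
    exact = z(x0 + h, y0 + h) - z(x0, y0)     # true increment
    approx = linear_estimate(x0, y0, h, h)    # linear estimate
    ratio = exact / approx                    # should tend to 1
    scaled_error = (exact - approx) / h ** 2  # should stay bounded
    rows.append((h, ratio, scaled_error))
    print(f"h = {h:g}: ratio = {ratio:.6f}, error / h^2 = {scaled_error:.4f}")
```

Expanding the product by hand gives $\Delta z = 7h + 9h^2 + 5h^3 + h^4$ at this point, so the scaled error tends to 9.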

Although the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ of a function z = F(x, y) are measured in different units, this in no way influences formula (4.12.1). If x, y, and z are measured in units of $e_1$, $e_2$, and E, then the derivatives $\partial z/\partial x$ and $\partial z/\partial y$ have the dimensions of $E/e_1$ and $E/e_2$, but the increments $\Delta x$ and $\Delta y$ have dimensions that coincide with those of x and y, that is, of $e_1$ and $e_2$, so that the two terms on the right-hand side of (4.12.1) have the same dimensions, of E, which coincide with the dimensions of the left-hand side, $\Delta z$. If we go over to a new unit for x, that is, $e_1' = ke_1$, the partial derivative $\partial z/\partial x$ will get multiplied by k, but the numerical value of $\Delta x$ will become k times smaller, so that the product $(\partial z/\partial x)\,\Delta x$ will remain unchanged. But if we go over to a new unit for z, or $E' = KE$, then the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ will be divided by K and, therefore, the increment $\Delta z$ given by (4.12.1) will be K times smaller (as was to be expected).



Figure 4.12.1 


The meaning of (4.12.1) can be explained geometrically. A function z = F(x, y) of two variables x and y is defined at all points of a plane domain D (the domain of the function) where each point is fixed by its two coordinates, x and y. Points M(x, y) and $N(x + \Delta x, y + \Delta y)$ are the opposite vertices in the rectangle $\Pi = MPNQ$ depicted in Figure 4.12.1. Movement along the straight lines MP and QN changes only the variable x, so that if we confine ourselves to these two straight lines, the function z can be considered a function of only one variable x (these are the functions $\varphi(x) = F(x, y)|_{y=\text{constant}}$ and $\varphi_1(x) = F(x, y + \Delta y)|_{y+\Delta y=\text{constant}}$); the increment (due to the transition from M to P and from Q to N) can be found via (4.1.2). In the same fashion, if we move only along the sides MQ and PN of the rectangle, the function z transforms into a function of only one variable y (these are the functions $\psi(y) = F(x, y)|_{x=\text{constant}}$ and $\psi_1(y) = F(x + \Delta x, y)|_{x+\Delta x=\text{constant}}$). At each vertex of the rectangle in Figure 4.12.1 we have written the (approximate) values of the function z.

The situation with a function with a number of variables greater than two is similar. For instance, if u = f(x, y, z, t), then the increment of the function that is caused by the increments in all four independent variables, $\Delta x$, $\Delta y$, $\Delta z$, $\Delta t$, is

$$\Delta u \approx \frac{\partial u}{\partial x}\,\Delta x + \frac{\partial u}{\partial y}\,\Delta y + \frac{\partial u}{\partial z}\,\Delta z + \frac{\partial u}{\partial t}\,\Delta t, \tag{4.12.1a}$$

where, say,

$$\frac{\partial u}{\partial x} = \left.\frac{df(x, y, z, t)}{dx}\right|_{y=\text{constant},\ z=\text{constant},\ t=\text{constant}},$$

etc. The accuracy of formula (4.12.1a) will be the greater, the smaller the quantities $\Delta x$, $\Delta y$, $\Delta z$, $\Delta t$ are.

Note also that the function z = F(x, y) actually connects three variables, x, y, and z, and each variable can be considered a function of the other two. For instance, y is a function of x and z, since by fixing the values $x = x_0$ and $z = z_0$ of the variables x and z we can, using the equation $z_0 = F(x_0, y)$, find the value of y (it could happen that there is more than one value, but this would only mean that z = F(x, y) specifies y as a multiple-valued function of x and z). This fact enables one to determine such partial derivatives as $(dy/dz)_{x=\text{constant}}$ and $(dy/dx)_{z=\text{constant}}$ and the like. In physical problems the choice of variables on which a given quantity z depends can be very different. Two different functions z = F(x, y) and z = G(x, u), where y and u are different physical parameters on which z depends, enable finding two different values of the partial derivative $\partial z/\partial x$, which would be appropriately denoted by $(\partial z/\partial x)_{y=\text{constant}}$ and $(\partial z/\partial x)_{u=\text{constant}}$ (or, briefly, by $\left.\frac{\partial z}{\partial x}\right|_y$ and $\left.\frac{\partial z}{\partial x}\right|_u$). It is clear that the two derivatives have the same dimensions, those of E/e, where E and e are the units of measurement of z and x, but numerically they are not necessarily the same. For instance, the energy W of a capacitor is $(1/2)C\varphi^2 = (1/2)q^2/C$, where C is the capacitance, $\varphi$ the voltage across the capacitor, and q the quantity of electricity, or charge, on the capacitor (see formula (13.4.3)). Therefore, if we start with the function $W = W(C, \varphi)$, we get

$$\left.\frac{\partial W}{\partial C}\right|_\varphi = \frac{1}{2}\varphi^2 = \frac{W}{C},$$

which is the rate of variation of the energy as the capacitance varies (and the voltage is kept constant), while if we start with the function W = W(C, q), we will find that

$$\left.\frac{\partial W}{\partial C}\right|_q = -\frac{1}{2}\frac{q^2}{C^2} = -\frac{W}{C}$$

(with the charge on the capacitor being constant); thus,

$$\left.\frac{\partial W}{\partial C}\right|_\varphi = -\left.\frac{\partial W}{\partial C}\right|_q.$$
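
The capacitor example can be reproduced numerically. In the sketch below the function names and the sample values of C and $\varphi$ are assumptions made for illustration, and the relation $q = C\varphi$ is used to pick a charge consistent with the chosen voltage (it is what makes the two expressions for the energy equal). The derivative with respect to C is estimated once with the voltage held fixed and once with the charge held fixed.

```python
def energy_from_voltage(C, phi):
    """W as a function of capacitance and voltage: W = (1/2) C phi^2."""
    return 0.5 * C * phi ** 2

def energy_from_charge(C, q):
    """W as a function of capacitance and charge: W = (1/2) q^2 / C."""
    return 0.5 * q ** 2 / C

def central_diff(f, u, h=1e-6):
    """Symmetric difference quotient approximating f'(u)."""
    return (f(u + h) - f(u - h)) / (2 * h)

# Sample state (arbitrary values, for illustration only).
C0, phi0 = 2.0, 3.0
q0 = C0 * phi0                       # charge consistent with this voltage
W0 = energy_from_voltage(C0, phi0)   # same energy from either formula

dW_dC_at_fixed_phi = central_diff(lambda C: energy_from_voltage(C, phi0), C0)
dW_dC_at_fixed_q = central_diff(lambda C: energy_from_charge(C, q0), C0)

print("W =", W0, "and W/C =", W0 / C0)
print("dW/dC at constant voltage:", dW_dC_at_fixed_phi)
print("dW/dC at constant charge: ", dW_dC_at_fixed_q)
```

The two estimates should come out equal in magnitude and opposite in sign, both of magnitude W/C.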

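It is clear that the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ of a function z = F(x, y) (with respect to x and y) are defined for all x and y from the domain D of the function F, that is, are themselves functions of x and y. Hence, we can define the partial derivatives of partial derivatives, or what is known as partial derivatives of second order:

$$\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial x^2}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial x\,\partial y},$$

$$\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial y\,\partial x}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial y^2}.$$

For instance, in the case of the function $z = xy^3$, which we have already met before, we have $\partial^2 z/\partial x^2 = \partial(y^3)/\partial x = 0$, $\partial^2 z/\partial x\,\partial y = \partial(y^3)/\partial y = 3y^2$, $\partial^2 z/\partial y\,\partial x = \partial(3xy^2)/\partial x = 3y^2$, and $\partial^2 z/\partial y^2 = \partial(3xy^2)/\partial y = 6xy$. Thus, we see that $\partial^2 z/\partial y\,\partial x = \partial^2 z/\partial x\,\partial y$.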

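We can easily see that this is not accidental but is a general law: $\partial^2 z/\partial x\,\partial y = \partial^2 z/\partial y\,\partial x$. Indeed, according to the definition of a derivative (or formula (4.1.2)),

$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) \approx \frac{1}{\Delta y}\left[\frac{\partial z(x, y + \Delta y)}{\partial x} - \frac{\partial z(x, y)}{\partial x}\right], \tag{4.12.2}$$

where the approximate equality may be as precise as desired (we need only select $\Delta y$ small enough). On the other hand, it is obvious that

$$\frac{\partial z(x, y + \Delta y)}{\partial x} \approx \frac{z(x + \Delta x, y + \Delta y) - z(x, y + \Delta y)}{\Delta x}, \tag{4.12.3a}$$

$$\frac{\partial z(x, y)}{\partial x} \approx \frac{z(x + \Delta x, y) - z(x, y)}{\Delta x}, \tag{4.12.3b}$$

where the approximate equalities become exact equalities as $\Delta x \to 0$. Substituting (4.12.3a) and (4.12.3b) into (4.12.2), we get

$$\frac{\partial^2 z}{\partial x\,\partial y} \approx \frac{1}{\Delta y}\left[\frac{z(x + \Delta x, y + \Delta y) - z(x, y + \Delta y)}{\Delta x} - \frac{z(x + \Delta x, y) - z(x, y)}{\Delta x}\right]$$

$$= \frac{z(x + \Delta x, y + \Delta y) - z(x, y + \Delta y) - z(x + \Delta x, y) + z(x, y)}{\Delta x\,\Delta y} \tag{4.12.4a}$$

(note that in the numerator on the right-hand side we have the sum (with alternating signs) of the values of function F at the vertices of rectangle $\Pi$ depicted in Figure 4.12.1, and in the denominator, the area of $\Pi$). Similarly,

$$\frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) \approx \frac{1}{\Delta x}\left[\frac{\partial z(x + \Delta x, y)}{\partial y} - \frac{\partial z(x, y)}{\partial y}\right]$$

and

$$\frac{\partial z(x + \Delta x, y)}{\partial y} \approx \frac{z(x + \Delta x, y + \Delta y) - z(x + \Delta x, y)}{\Delta y}, \qquad \frac{\partial z(x, y)}{\partial y} \approx \frac{z(x, y + \Delta y) - z(x, y)}{\Delta y},$$

whence

$$\frac{\partial^2 z}{\partial y\,\partial x} \approx \frac{1}{\Delta x}\left[\frac{z(x + \Delta x, y + \Delta y) - z(x + \Delta x, y)}{\Delta y} - \frac{z(x, y + \Delta y) - z(x, y)}{\Delta y}\right]$$

$$= \frac{z(x + \Delta x, y + \Delta y) - z(x + \Delta x, y) - z(x, y + \Delta y) + z(x, y)}{\Delta x\,\Delta y}. \tag{4.12.4b}$$

Since the approximate equalities (4.12.4a) and (4.12.4b) may be made as accurate as desired, we conclude that $\partial^2 z/\partial x\,\partial y = \partial^2 z/\partial y\,\partial x$.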
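
The argument above can be followed step by step in a short computation. The sketch below is an illustration under assumptions made here: forward difference quotients stand in for the partial derivatives, the test function is $z = xy^3$, and the point (2, 1) and the step sizes are arbitrary. It builds the two nested quotients in the two possible orders; both reduce to the same alternating sum over the vertices of the rectangle, so they agree up to rounding, and both approach the exact mixed derivative $3y^2$.

```python
def z(x, y):
    return x * y ** 3

def dz_dx(x, y, dx):
    """Forward difference quotient in x, as in (4.12.3a) and (4.12.3b)."""
    return (z(x + dx, y) - z(x, y)) / dx

def dz_dy(x, y, dy):
    """Forward difference quotient in y."""
    return (z(x, y + dy) - z(x, y)) / dy

x0, y0 = 2.0, 1.0   # arbitrary point, for illustration only
dx = dy = 1e-3      # small increments

# First differentiate in x, then in y: right-hand side of (4.12.4a).
mixed_xy = (dz_dx(x0, y0 + dy, dx) - dz_dx(x0, y0, dx)) / dy
# First differentiate in y, then in x: right-hand side of (4.12.4b).
mixed_yx = (dz_dy(x0 + dx, y0, dy) - dz_dy(x0, y0, dy)) / dx

exact = 3 * y0 ** 2
print(mixed_xy, mixed_yx, exact)
```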

Partial derivatives of higher orders are 
defined similarly; the value of the derivative 
is determined by how many times we diff- 
erentiate with respect to each variable and 
not by the order in which this is done. It is 
quite easy to generalize the facts for the case 
of a function of three or more variables, but 
we will not do this here. 

If in space we introduce a Cartesian system of coordinates (see Figure 4.12.2) consisting of three coordinate axes, x, y, and z, then the “graph” of the function z = F(x, y) is a surface ($\Sigma$). This surface is formed by all the points M(x, y, z) obtained as a result of the following process. On each straight line that is perpendicular to the xy-plane and erected at point P(x, y) (the perpendiculars are parallel to the z axis) we lay off a segment PM of length $|z|$ (this segment is laid off upward or downward depending on whether z is positive or negative). Suppose that the surface $\Sigma$ is smooth in the neighborhood of a point $M(x_0, y_0, z_0)$, with $z_0 = F(x_0, y_0)$, that is, it has no salient points or discontinuities or “cusps” (like the vertex, or apex, of a conical surface) or similar “singular” points. In this case the tangents to all the curves that pass through point M and belong to $\Sigma$ form a plane $\sigma$ known as the tangent plane (the plane tangent to a surface at a certain point). Clearly, the tangent plane $\sigma$ is completely defined by the tangents $t_1$ and $t_2$ to the curves along which the planes y = constant ($= y_0$) and x = constant ($= x_0$) cut $\Sigma$, and the slopes of $t_1$ and $t_2$ in relation to the xy-plane are given by the values of the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$. (Why?) It is easy to see that the equation of plane $\sigma$ in the xyz coordinate system has the form

$$z - z_0 = \frac{\partial z}{\partial x}(x - x_0) + \frac{\partial z}{\partial y}(y - y_0), \tag{4.12.5}$$

where, say, $\partial z/\partial x$ is $(\partial z/\partial x)_{x = x_0,\, y = y_0}$.

The right-hand side of (4.12.1), $(\partial z/\partial x)\,\Delta x + (\partial z/\partial y)\,\Delta y$, is called the total differential of function z = F(x, y) and is denoted by dz. If z = x (i.e. F(x, y) = x), then, obviously, the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ of this simplest function of two variables (which actually depends only on x) are 1 and 0, respectively, and therefore the total differential of this function is $dx = 1 \times \Delta x + 0 \times \Delta y = \Delta x$. Similarly, it can be proved that $dy = \Delta y$, where on the left-hand side we have the total differential of the function G(x, y) = y of two variables (actually of one variable) and on the right-hand side we have the increment of y. Since $\Delta x = dx$ and $\Delta y = dy$, we can rewrite the formula for the total differential of z = F(x, y) as

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy. \tag{4.12.6}$$

The equation of the plane $\sigma$ tangent to surface $\Sigma$, (4.12.5), clarifies the geometrical meaning of the total differential. This equation implies that the Z-coordinate (below we will see why the letter “Z” is more convenient than the letter “z”) of point $N_1(x, y, Z)$ in plane $\sigma$, a point corresponding to the value $x_0 + \Delta x$ of the x-coordinate and to the value $y_0 + \Delta y$ of the y-coordinate, is equal to $z_0 + dz$ (compare with (4.12.5), where the increments $x - x_0$ and $y - y_0$ of the x- and y-coordinates are now denoted by $\Delta x$ and $\Delta y$). On the other hand, the z-coordinate of the respective point N(x, y, z) on $\Sigma$ is equal to $z_0 + \Delta z$, where $\Delta z$ is the increment of z, or simply $z - z_0$. Thus, the approximate formula (4.12.1), rewritten as

$$z \approx z_0 + dz, \tag{4.12.7}$$

suggests that, for small $\Delta x$ and $\Delta y$, the point N(x, y, z) of $\Sigma$, where $x = x_0 + \Delta x$ and $y = y_0 + \Delta y$, will lie sufficiently close to point $N_1(x, y, Z)$, where $z = z_0 + \Delta z$ and $Z = z_0 + dz$, that belongs to plane $\sigma$ tangent to $\Sigma$ at point $M(x_0, y_0, z_0)$ (see Figure 4.12.2). The error introduced by the approximate formula (4.12.7) is a quantity of second order of smallness if compared with dx and dy (or $\Delta x$ and $\Delta y$), which are assumed to be quantities of equal order of smallness.

Quite similarly, for a function of, say, four variables, u = f(x, y, z, t), the total differential du is the expression on the right-hand side of (4.12.1a) in which the increments $\Delta x$, $\Delta y$, $\Delta z$, $\Delta t$ of the independent variables can be replaced with dx, dy, dz, dt. Thus, we can rewrite the approximate formula (4.12.1a) as

$$\Delta u \approx du.$$


Exercises 

Find the first partial derivatives of the 
following functions: 

4.12.1. $z = x^2 + y^2$. 4.12.2. $z = 1/(x^2 + y^2)$. 4.12.3. $z = \sqrt{x^2 + y^2}$. 4.12.4. $z = e^{-(x^2 + y^2)}$. 4.12.5. $z = x^2y^3$. 4.12.6. $z = xe^y + ye^x$. 4.12.7. $z = \ln(x + y)$. 4.12.8. $z = \cos(xy)$.

4.12.9. Find the second partial derivatives of the functions in Exercises 4.12.1-4.12.8 and check whether $\partial^2 z/\partial x\,\partial y = \partial^2 z/\partial y\,\partial x$.

4.13 The Derivative of an Implicit Function

To define a function implicitly means to define it via an expression of the form

$$F(x, y) = 0. \tag{4.13.1}$$

If the equation can be solved for x or y, then we revert to the ordinary representation of the function in explicit form. But sometimes such a solution leads to complicated formulas and at other times it cannot be found at all. For instance, the equation of a circle in the form

$$x^2 + y^2 - 1 = 0 \tag{4.13.2}$$

is simpler than the following expression derived from it:

$$y(x) = \pm\sqrt{1 - x^2}. \tag{4.13.3}$$

If the left-hand side of (4.13.1) is an arbitrary polynomial involving x and y to a power exceeding the fourth, then in general this equation cannot be solved for x or for y. Also unsolvable, for example, is the simple-looking equation

$$F(x, y) = x \sin x + y \sin y - \pi = 0. \tag{4.13.4}$$

However, even in those cases where there is no solution in the form of a formula that specifies directly a procedure for computing y for a given x, it still remains a fact that y is a definite function of x. For every x it is possible, by solving the equation numerically, to find a corresponding y and to construct a curve (4.13.1) in the xy-plane. It may be that the curve will not exist for all x (in the case of a circle (4.13.2), for example, it exists only for x between $-1$ and $+1$) or that the function is multiple-valued, that is, for a given x there may be more than one value of y (in the case of the circle, for instance, there are two values in accord with the $\pm$ sign in front of the square root sign). However, these complications do not detract from the basic fact, which is that the equation F(x, y) = 0 defines y as a function of x.

How can we find the derivative dy/dx? And can this be done without solving the equation, that is, without expressing y(x) in explicit fashion?

The procedure which enables finding y' from Eq. (4.13.1) was developed by Newton. Let the variables x and y satisfy Eq. (4.13.1) and let $x + \Delta x$ and $y + \Delta y$ be adjacent values of the variables that still satisfy Eq. (4.13.1), where the word “adjacent” emphasizes the smallness of $\Delta x$ and $\Delta y$. Since z = F(x, y) = 0 and $z + \Delta z = F(x + \Delta x, y + \Delta y) = 0$, we have $\Delta z = F(x + \Delta x, y + \Delta y) - F(x, y) = 0$. But, in view of the basic formula (4.12.1), we have

$$\Delta z \approx \frac{\partial F(x, y)}{\partial x}\,\Delta x + \frac{\partial F(x, y)}{\partial y}\,\Delta y,$$

which means that

$$\frac{\partial F}{\partial x}\,\Delta x + \frac{\partial F}{\partial y}\,\Delta y \approx 0 \tag{4.13.5}$$

and so

$$\frac{\Delta y}{\Delta x} \approx -\frac{\partial F(x, y)/\partial x}{\partial F(x, y)/\partial y}, \tag{4.13.6}$$

where the approximate equality in (4.13.5) and, therefore, in (4.13.6), is the more exact the smaller the increments $\Delta x$ and $\Delta y$ are.

Passing to the limit, $\Delta x \to 0$ (this will also mean that $\Delta y \to 0$), we get the derivative on the left, and the approximate equality in (4.13.6) becomes an exact equality. Finally, we have

$$\frac{dy}{dx} = -\frac{\partial F(x, y)/\partial x}{\partial F(x, y)/\partial y}. \tag{4.13.7}$$

Note the minus sign in (4.13.7) and also the fact that we cannot simply cancel the $\partial F(x, y)$ in the numerator and the denominator.

We will demonstrate the application of (4.13.7) using as an example Eq. (4.13.2). We have $F(x, y) = x^2 + y^2 - 1$. Then

$$\frac{\partial F(x, y)}{\partial x} = 2x, \qquad \frac{\partial F(x, y)}{\partial y} = 2y, \qquad \frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y}. \tag{4.13.8}$$

It is easy to see that this result coincides with that obtained if we compute the derivative of (4.13.3).


Let us find the derivative in the case of (4.13.4):

$$\frac{\partial F(x, y)}{\partial x} = \sin x + x \cos x, \qquad \frac{\partial F(x, y)}{\partial y} = \sin y + y \cos y,$$

$$\frac{dy}{dx} = -\frac{\sin x + x \cos x}{\sin y + y \cos y}.$$

Thus, the expression of the derivative of an implicit function involves both quantities, x and y. To find it numerically, we must find y numerically for a given x. But if we did not have formula (4.13.7), then to find the derivative would require finding, numerically, two values $y_2$ and $y_1$ for two adjacent values $x_2$ and $x_1$ and finding the ratio $(y_2 - y_1)/(x_2 - x_1)$ and the limit of this ratio.^{4.24} Here, the closer $x_2$ is to $x_1$, the more exactly we would need to compute $y_2$ and $y_1$, and this is often very difficult to do. On the other hand, the use of (4.13.7) presents no difficulties.

If F(x, y) = 0 leads to a nonsingle-valued function y(x), that is, when there are two or more values of y for one value of x (several branches of the curve), then (4.13.7) yields the values of the derivative at appropriate points of the curve F(x, y) = 0 when a given x and different y's are substituted. The reader is advised to verify this by using the equation of the circle, (4.13.2), for which the derivative is given by formula (4.13.8).
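
Both examples can be tried out numerically. The sketch below is an illustration under stated assumptions: the helper names are introduced here, the derivative of the explicit circle branches is estimated by a symmetric difference quotient, and Eq. (4.13.4) is solved for y by bisection on the interval $1 \le y \le 2$, where $y \sin y$ happens to be monotonic and which contains the solution $y = \pi/2$ for $x = \pi/2$. The slope from (4.13.7) is then compared with the ratio $(y_2 - y_1)/(x_2 - x_1)$ built from two numerically solved neighboring points.

```python
import math

def central_diff(f, u, h=1e-6):
    """Symmetric difference quotient approximating f'(u)."""
    return (f(u + h) - f(u - h)) / (2 * h)

# Circle (4.13.2): formula (4.13.8) on the upper and lower branches.
def circle_slope(x, y):
    return -x / y

x_c = 0.6
slope_upper = circle_slope(x_c, 0.8)    # point on y = +sqrt(1 - x^2)
slope_lower = circle_slope(x_c, -0.8)   # point on y = -sqrt(1 - x^2)
explicit_upper = central_diff(lambda x: math.sqrt(1 - x * x), x_c)
explicit_lower = central_diff(lambda x: -math.sqrt(1 - x * x), x_c)

# Equation (4.13.4): F(x, y) = x sin x + y sin y - pi = 0.
def F(x, y):
    return x * math.sin(x) + y * math.sin(y) - math.pi

def implicit_slope(x, y):
    """Formula (4.13.7) applied to (4.13.4)."""
    Fx = math.sin(x) + x * math.cos(x)
    Fy = math.sin(y) + y * math.cos(y)
    return -Fx / Fy

def solve_y(x, lo=1.0, hi=2.0, steps=80):
    """Bisection for y in [lo, hi]; assumes F(x, lo) and F(x, hi) differ in sign."""
    f_lo = F(x, lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = F(x, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = math.pi / 2
y0 = solve_y(x0)
slope_formula = implicit_slope(x0, y0)
h = 1e-4
slope_two_points = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)

print(slope_upper, explicit_upper)
print(slope_lower, explicit_lower)
print(y0, slope_formula, slope_two_points)
```

The two-point estimate needs two full numerical solutions of the equation, while the formula needs only one, which is the practical advantage mentioned in the text.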

^{4.24} Additional difficulties arise when Eq. (4.13.1) specifies a multiple-valued function y = y(x), since then one must “match” the values $y_1(x_1)$ and $y_2(x_2)$; otherwise for a small difference $x_2 - x_1$ the difference $y_2 - y_1$ may prove to be large.

Let us assume that we have a function of two variables, z = F(x, y), for which we do not require that it be zero. Instead we will consider this function along a curve $\gamma$ in the xy-plane fixed parametrically by the equations $x = \varphi(t)$ and $y = \psi(t)$, where parameter t can be the time variable, for instance. The value of z will then depend on t, too. Let us find the rate of variation of z, or the derivative dz/dt.

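We introduce a small increment $\Delta t$ into the variable t. The variables x and y will then increase by $\Delta x \approx \varphi'(t)\,\Delta t = (dx/dt)\,\Delta t$ and $\Delta y \approx \psi'(t)\,\Delta t = (dy/dt)\,\Delta t$. On the other hand, by (4.12.1), we have

$$\Delta z \approx \frac{\partial z}{\partial x}\,\Delta x + \frac{\partial z}{\partial y}\,\Delta y \approx \frac{\partial z}{\partial x}\frac{dx}{dt}\,\Delta t + \frac{\partial z}{\partial y}\frac{dy}{dt}\,\Delta t,$$

whence

$$\frac{\Delta z}{\Delta t} \approx \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t}, \tag{4.13.9}$$

and the approximation is the more exact the smaller the increment $\Delta t$ is (and, hence, the smaller the $\Delta x$ and $\Delta y$ are). We now send $\Delta t$ to zero in (4.13.9). The ratios $\Delta z/\Delta t$, $\Delta x/\Delta t$ and $\Delta y/\Delta t$ will transform into the respective derivatives and the equality in (4.13.9) will become exact. The final result is

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{4.13.10}$$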


In particular, if the curve $\gamma$ is specified explicitly, by an equation y = f(x) (i.e. if parameter t is the abscissa x), then (4.13.10) transforms into

$$\frac{dz}{dx} = \frac{\partial z}{\partial y}\frac{dy}{dx} + \frac{\partial z}{\partial x}. \tag{4.13.11}$$

For instance, if z = xy, the rate of growth of z along the circle $x = a\cos t$, $y = a\sin t$ is

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = y(-a\sin t) + x(a\cos t) = a^2\cos^2 t - a^2\sin^2 t = a^2\cos 2t,$$

while along the straight line y = kx we have

$$\frac{dz}{dx} = \frac{\partial z}{\partial y}\frac{dy}{dx} + \frac{\partial z}{\partial x} = xk + y = 2kx.$$
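
The two results for z = xy can be verified by differentiating z along the curve directly. In the sketch below the values of a, k, t, and x are arbitrary choices made for illustration, and `central_diff` is a helper introduced here that estimates a derivative by a symmetric difference quotient. The chain-rule expressions from (4.13.10) and (4.13.11) are compared with the numerical derivative of z taken along the circle and along the straight line.

```python
import math

def central_diff(f, u, h=1e-6):
    """Symmetric difference quotient approximating f'(u)."""
    return (f(u + h) - f(u - h)) / (2 * h)

def z(x, y):
    return x * y

a, k = 2.0, 3.0   # circle radius and slope of the line (arbitrary values)

# Along the circle x = a cos t, y = a sin t.
t0 = 0.4
x_t, y_t = a * math.cos(t0), a * math.sin(t0)
dz_dt_chain = y_t * (-a * math.sin(t0)) + x_t * (a * math.cos(t0))   # (4.13.10)
dz_dt_direct = central_diff(lambda t: z(a * math.cos(t), a * math.sin(t)), t0)
dz_dt_closed = a ** 2 * math.cos(2 * t0)

# Along the straight line y = k x.
x0 = 1.5
dz_dx_chain = x0 * k + k * x0                                        # (4.13.11)
dz_dx_direct = central_diff(lambda x: z(x, k * x), x0)

print(dz_dt_chain, dz_dt_direct, dz_dt_closed)
print(dz_dx_chain, dz_dx_direct, 2 * k * x0)
```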


Formulas (4.13.10) and (4.13.11) specify the increment of the z-coordinate of a point M of the surface $\Sigma$ (whose equation is z = F(x, y); see Figure 4.12.2) caused by a shift along a curve $\Gamma$ lying on $\Sigma$, a curve whose projection in the xy-plane is curve $\gamma$ with equations x = x(t) and y = y(t) (or y = f(x)). The same formulas can be applied to a function with a higher number of variables. Suppose that T is the temperature at a point M(x, y, z) in space and that the temperature varies with time t; we can say that the temperature is a function of four variables: T = T(x, y, z, t). We wish to find the rate with which the temperature changes when we go over from point $M(x, y, z) = M(x_0, y_0, z_0)$ to a neighboring point $M_1(x + dx, y + dy, z + dz)$ (this transition occurs over a small time interval dt, we assume). Ignoring the changes in temperature in time, we arrive at a stationary (or steady-state or time-independent) temperature distribution T = T(x, y, z) ($= T(x, y, z, t_0)$). Here the rate of temperature variation caused by transition from point M to point $M_1$, in accord with formula (4.13.9) modified in the appropriate manner, is

$$v_{\text{con}} = \frac{\partial T}{\partial x}\frac{dx}{dt} + \frac{\partial T}{\partial y}\frac{dy}{dt} + \frac{\partial T}{\partial z}\frac{dz}{dt}.$$

This change in temperature, as we have already said, is caused by a shift in space from one point, M(x, y, z), to another point, $M_1(x + dx, y + dy, z + dz)$, along a certain curve x = x(t), y = y(t), z = z(t), say, a transition from a point with a lower temperature to a point with a higher temperature; accordingly, the rate of change of temperature $v_{\text{con}}$ associated with such a shift is called the convective rate. Now let us turn to the study of the temperature variation in time at a certain point $M(x_0, y_0, z_0)$. Here the temperature depends only on time t, that is, T = T(t) ($= T(x_0, y_0, z_0, t)$). The rate of variation of temperature associated with this function (the fact that the temperature field also depends on time) is called the local rate:

$$v_{\text{loc}} = \frac{\partial T}{\partial t}.$$

Finally (compare this with formula (4.13.11)), the total rate of temperature variation can be found by adding the local rate to the convective rate:

$$v_{\text{tot}} = v_{\text{con}} + v_{\text{loc}} = \frac{dT}{dt} = \frac{\partial T}{\partial x}\frac{dx}{dt} + \frac{\partial T}{\partial y}\frac{dy}{dt} + \frac{\partial T}{\partial z}\frac{dz}{dt} + \frac{\partial T}{\partial t}$$

(the derivative dT/dt is sometimes called the total derivative of T = T(x, y, z, t), where x = x(t), y = y(t), z = z(t)).
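
The decomposition of the total rate into a convective part and a local part can be illustrated on a made-up example. Everything specific in the sketch below is an assumption chosen only for illustration: the temperature field $T = xy + z^2 + tx$, the trajectory $x = \cos t$, $y = \sin t$, $z = t$, and the helper names. The partial derivatives are written out by hand, and their combination is compared with a numerical derivative of T taken along the trajectory.

```python
import math

def T(x, y, z, t):
    # Hypothetical temperature field, not taken from the text.
    return x * y + z ** 2 + t * x

def path(t):
    # Hypothetical trajectory x(t), y(t), z(t).
    return math.cos(t), math.sin(t), t

def central_diff(f, u, h=1e-6):
    """Symmetric difference quotient approximating f'(u)."""
    return (f(u + h) - f(u - h)) / (2 * h)

t0 = 0.8
x0, y0, z0 = path(t0)
vx, vy, vz = -math.sin(t0), math.cos(t0), 1.0   # dx/dt, dy/dt, dz/dt

# Partial derivatives of this particular field, computed by hand.
dT_dx = y0 + t0
dT_dy = x0
dT_dz = 2 * z0
dT_dt = x0

v_con = dT_dx * vx + dT_dy * vy + dT_dz * vz   # convective rate
v_loc = dT_dt                                  # local rate
v_tot = central_diff(lambda t: T(*path(t), t), t0)   # total derivative dT/dt

print(v_con, v_loc, v_con + v_loc, v_tot)
```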

In finding the derivative of an implicit function (or the derivative of a function z = z(t) specified in the form of a composite function of two variables, z = F(x, y), where x = x(t) and y = y(t)) we had to introduce a new concept, that of the partial derivative (see Section 4.12). This notion is of great importance to the theory of functions of several variables, on which we have only briefly touched here. Actually, we have already, latently, made use of the concept of a partial derivative. For instance, the definite integral $I = \int_a^b f(x)\,dx = I(a, b)$ is a function of two variables: above we found the partial derivatives $\partial I/\partial a$ ($= -f(a)$) and $\partial I/\partial b$ ($= f(b)$) with respect to one of the limits of integration (a or b) assuming the other limit of integration (b or a) fixed (or constant). Even in such elementary questions as finding the derivative of the product of several functions, y = h(x) g(x), or as finding the derivative of $y = h(x)^{g(x)}$ (see Section 4.9), the concept of a partial derivative was actually present. Indeed, as was mentioned earlier, y' in these cases is composed of terms obtained when taking the derivatives with respect to x in h(x) and with respect to x in g(x), which clearly involves partial derivatives. The corresponding general rule can be written as follows: if y = F[g(x), h(x)], then, in accord with (4.13.10),

$$y' = \frac{dy}{dx} = \frac{\partial F}{\partial g}\frac{dg}{dx} + \frac{\partial F}{\partial h}\frac{dh}{dx}.$$


Exercises 


4.13.1. Find the derivative dy/dx of a function defined by Eq. (4.13.4) at the following points: (a) $x = \pi/2$, $y = \pi/2$, and (b) $x = -\pi/2$, $y = \pi/2$.

4.13.2. Find the derivative dy/dx at the 
point x = y = 1 of a function defined by the 
equation x z + 3x + y* — 8 = 0. 

4.13.3. Let $z = \sqrt{x^2 + y^2}$. Find (a) dz/dt with $x = e^{kt}\cos(lt)$ and $y = e^{kt}\sin(lt)$, and (b) dz/dx with $y = x^2$.





Chapter 5 Integration Techniques 


5.1 Statement of the Problem^{5.1}

In Chapter 3 we introduced the concept 
of the integral and noted the close con- 
nection between two different (at first 
glance) problems. The problems are 

(1) finding the sum of a large number 
of small summands when the terms can 
be represented as v ( t ) At (or v ( t ) dt); 

(2) finding the function z (t) whose de- 
rivative is equal to a given function 
v (t): 

$$\frac{dz}{dt} = v(t). \qquad (5.1.1)$$

Most of the problems that arise in phys- 
ics, engineering, chemistry, as well 
as in mathematics, are problems of type 
(1), that is, they involve finding the 
sum of a large number of small sum- 
mands. This statement of the problem 
is more pictorial and suggests a simple, 
though approximate, way of computing 
the quantity of interest, namely, by 
directly adding the (small) summands 
of which we spoke earlier. However, 
this “direct” method of solving prob- 
lems of type (1) does not make it poss- 
ible to express the answer by any gen- 
eral formula, and higher mathematics 
appeared when the relationship was es- 
tablished between problems of types 

(1) and (2), which opened the way for 
general algorithms for solving problems 
of type (1). 

The statement of problems of type 

(2) is more artificial, but it has its ad- 
vantages. The finding of derivatives 
proved to be a very simple matter, 
which reduced to four or five formulas 
(the derivatives of $x^n$, $e^x$, $\ln x$, $\sin x$,
cos x) and two or three rules. It is there- 
fore easy to find the derivatives of a 
large number of functions. Every time 
the derivative dz/dt = v of some function
is found, we can record the fact
that for this v the integral z is known 
(see Section 5.2). In this way we can 
build up a range of particular cases in 


5.1 Before reading this chapter, the reader is
advised to reread Chapter 3.


which it is possible to solve problems of 
type (2). For certain simple types of 
functions v , it has been possible, with 
the aid of algebraic identity transform- 
ations, to find the general rules of 
solving problems of type (2), say, when 
v is a sum of functions whose integrals 
are known (see Section 5.3). 

However, this is not possible 
for all the elementary functions, so 
that integration is more difficult than 
differentiation (finding derivatives). Nev- 
ertheless, the formulas found for cer- 
tain integrals in the second statement 
of the problem are very important. If 
for a given v (t) it is possible to find 
the integral (the indefinite integral or 
the antiderivative) z (t), then all prob- 
lems in the first statement, all sums, 
that is, all definite integrals 

$\int_a^b v(t)\,dt$, are then expressed by
simple formulas via the function z(t):

$$\int_a^b v(t)\,dt = z(b) - z(a).$$

Such a result is more complete, more 
exact, and more valuable than the result
of every separate numerical computation
of a sum, that is, of the definite
integral $\int_a^b v(t)\,dt$ between definite
limits a and b. For this reason we aim
primarily to solve the problem of in- 
tegration in its second statement. 
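
The two statements of the problem can be compared numerically. The following is only a sketch under stated assumptions: the function v(t) = t^2, the limits 1 and 2, and the helper name riemann_sum are illustrative choices, not taken from the text. It adds up many small summands v(t) dt directly (problem of type (1)) and compares the sum with z(b) - z(a), where z(t) = t^3/3 is a function whose derivative is v (problem of type (2)).

```python
def riemann_sum(v, a, b, n):
    """Add up n small summands v(t) * dt, sampling v at the middle of each step."""
    dt = (b - a) / n
    return sum(v(a + (i + 0.5) * dt) for i in range(n)) * dt


def v(t):
    # Illustrative integrand (an assumption for this sketch).
    return t ** 2


def z(t):
    # A function whose derivative is v(t): dz/dt = t**2.
    return t ** 3 / 3


a, b = 1.0, 2.0
direct = riemann_sum(v, a, b, 1000)   # type (1): direct summation
exact = z(b) - z(a)                   # type (2): via the antiderivative
print(direct, exact)
```

The direct sum has to be recomputed for every new pair of limits, whereas the formula z(b) - z(a) covers all of them at once, which is the advantage the text describes.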

5.2 Elementary Integrals 

Let us write down the formulas for 
derivatives that have been found in 
Chapter 4 and the corresponding (in- 
definite) integrals: 

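$$\frac{d}{dx}(x^n) = nx^{n-1}, \qquad n\int x^{n-1}\,dx = x^n + C;$$

$$\frac{d}{dx}(e^{kx}) = ke^{kx}, \qquad k\int e^{kx}\,dx = e^{kx} + C;$$

$$\frac{d}{dx}(\ln x) = \frac{1}{x}, \qquad \int \frac{1}{x}\,dx = \ln x + C;$$

$$\frac{d}{dx}(\sin kx) = k\cos kx, \qquad k\int \cos kx\,dx = \sin kx + C;$$

$$\frac{d}{dx}(\cos kx) = -k\sin kx, \qquad -k\int \sin kx\,dx = \cos kx + C;$$

$$\frac{d}{dx}(\tan x) = \frac{1}{\cos^2 x}, \qquad \int \frac{1}{\cos^2 x}\,dx = \tan x + C;$$

$$\frac{d}{dx}(\cot x) = -\frac{1}{\sin^2 x}, \qquad -\int \frac{1}{\sin^2 x}\,dx = \cot x + C;$$

$$\frac{d}{dx}(\arcsin x) = \frac{1}{\sqrt{1 - x^2}}, \qquad \int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin x + C;$$

$$\frac{d}{dx}(\arctan x) = \frac{1}{1 + x^2}, \qquad \int \frac{1}{1 + x^2}\,dx = \arctan x + C.$$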


Let us perform a few manipulations. 
In the first integral, we denote n - 1
by m, that is, n = m + 1, and rewrite
it thus:

$$\int x^m\,dx = \frac{1}{m+1}x^{m+1} + C. \qquad (5.2.1)$$

It is clear that the formula is valid for
all m except m = -1; for m = -1
the denominator vanishes and $x^{m+1} = x^0 = 1$,
so that we arrive at an
expression unsuitable for computations:
1/0 + C. However, it is precisely
in the case where m = -1, that is, for
$\int (1/x)\,dx$, that we have the formula^{5.2}

$$\int \frac{dx}{x} = \ln x + C. \qquad (5.2.2)$$

This formula holds true only for positive
values of x, since ln x is meaningful
only for x > 0. For x < 0, ln x


5 - 2 In Section 7.6 we will discuss the 
marked difference between formulas (5.2.1) 
and (5.2.2) for integrals of power functions. 


is meaningless, but ln(-x) is meaningful.
Since

$$\frac{d\ln(-x)}{dx} = \frac{1}{-x}\cdot(-1) = \frac{1}{x},$$

for x < 0 we have $\int dx/x = \ln(-x) + C$.
Both formulas for $\int dx/x$ can be
combined into one:

$$\int \frac{dx}{x} = \ln|x| + C. \qquad (5.2.3)$$

This formula may be used for any do- 
main of integration that does not con- 
tain x = 0. 

The integral of the exponential func- 
tion can be written thus: 

$$\int e^{kx}\,dx = \frac{1}{k}e^{kx} + C. \qquad (5.2.4)$$


Similarly, for the sine and cosine 
we get 


$$\int \sin kx\,dx = -\frac{1}{k}\cos kx + C, \qquad (5.2.5)$$

$$\int \cos kx\,dx = \frac{1}{k}\sin kx + C. \qquad (5.2.6)$$
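
Each formula of this section can be checked by differentiating its right-hand side and comparing with the integrand. The sketch below does this numerically with a central difference; the sample values k = 3, m = 2.5, x0 = 0.7 and the helper name derivative are assumptions made only for the illustration.

```python
import math


def derivative(F, x, h=1e-5):
    """Central-difference estimate of dF/dx at the point x."""
    return (F(x + h) - F(x - h)) / (2 * h)


k = 3.0   # sample constant in e^{kx}, sin kx, cos kx (illustrative)
m = 2.5   # sample exponent, any value except -1 (illustrative)

# Pairs (antiderivative, integrand) taken from formulas (5.2.1)-(5.2.6).
table = [
    (lambda x: x ** (m + 1) / (m + 1), lambda x: x ** m),           # (5.2.1)
    (lambda x: math.log(abs(x)),       lambda x: 1 / x),            # (5.2.3)
    (lambda x: math.exp(k * x) / k,    lambda x: math.exp(k * x)),  # (5.2.4)
    (lambda x: -math.cos(k * x) / k,   lambda x: math.sin(k * x)),  # (5.2.5)
    (lambda x: math.sin(k * x) / k,    lambda x: math.cos(k * x)),  # (5.2.6)
]

x0 = 0.7
errors = [abs(derivative(F, x0) - f(x0)) for F, f in table]
print(max(errors))   # a small number if every formula is right
```

A check of this kind cannot prove a formula, but it catches sign errors and misplaced factors of k quickly.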


We conclude with an example that 
shows the necessity of a careful exam- 
ination of the function and the danger of 
a purely formal approach. We will
evaluate the integral $I = \int_a^b dx/x^2$. The
indefinite integral is given by the formula
$\int dx/x^2 = -1/x + C$, whence

$$I = -\frac{1}{x}\Big|_a^b = \frac{1}{a} - \frac{1}{b} = \frac{b - a}{ab}. \qquad (5.2.7)$$


Since the integrand is positive, the 
result must be positive, too, provided 
that b > a. The answer is indeed posi- 
tive if b > a and both a and b have the 
same sign. However, for the integral
$\int_{-1}^{1} dx/x^2$ formula (5.2.7) yields a clearly
absurd result I = -2. The reason
for this is that the integrand becomes
infinitely large inside the domain of
integration (at x = 0), and at this
point the function -1/x experiences an infinite
jump (this function is the indefinite
integral of $1/x^2$).

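To find the true reason for all this,
we must exclude a small neighbourhood
of the "singular point" x = 0 from the
domain of integration -1 < x < 1,
that is, we must take out the interval
$-\varepsilon_1 < x < \varepsilon_2$, where $\varepsilon_1$ and $\varepsilon_2$ are small
positive numbers, and consider the sum

$$K = \int_{-1}^{-\varepsilon_1} \frac{dx}{x^2} + \int_{\varepsilon_2}^{1} \frac{dx}{x^2}.$$

It is natural to assume that integral I
is obtained from K if we send $\varepsilon_1$ and
$\varepsilon_2$ to zero. But formula (5.2.7) yields

$$K = \frac{1 - \varepsilon_1}{\varepsilon_1} + \frac{1 - \varepsilon_2}{\varepsilon_2} = \frac{1}{\varepsilon_1} + \frac{1}{\varepsilon_2} - 2.$$

When $\varepsilon_1$ and $\varepsilon_2$ tend to zero, K tends
to $\infty$.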
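
The growth of K can be made concrete with a few numbers. This is a minimal sketch, assuming equal cut-offs on both sides of the singular point; the names F and K mirror the text, and the sample values of the cut-off are illustrative.

```python
def F(x):
    """Indefinite integral of 1/x**2 (valid on any interval not containing 0)."""
    return -1.0 / x


def K(eps1, eps2):
    """Integral of 1/x**2 over [-1, -eps1] plus the integral over [eps2, 1]."""
    return (F(-eps1) - F(-1.0)) + (F(1.0) - F(eps2))


# Formal (and wrong) use of the antiderivative across the singular point:
formal = F(1.0) - F(-1.0)
print("formal result:", formal)

# Excluding a shrinking neighbourhood of x = 0 shows the true behaviour:
for eps in (0.1, 0.01, 0.001):
    print(eps, K(eps, eps))
```

The formal value is negative although the integrand is positive everywhere, while K keeps growing as the excluded neighbourhood shrinks.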

In other cases an integral whose in- 
tegrand becomes infinitely large within 
the domain of integration may assume 
a finite value. For instance, 

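$$\int_0^1 \frac{dx}{\sqrt{x}} = 2.$$

Indeed, the corresponding indefinite
integral is given by the formula

$$\int \frac{dx}{\sqrt{x}} = \int x^{-1/2}\,dx = \frac{1}{1/2}x^{1/2} + C = 2\sqrt{x} + C,$$

whence

$$\int_\varepsilon^1 \frac{dx}{\sqrt{x}} = 2\sqrt{x}\,\Big|_\varepsilon^1 = 2 - 2\sqrt{\varepsilon},$$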


Exercises 


Evaluate the following definite integrals: 
l 2 

x 2 

o i 

n 1 

dx 


L 

3.2.1. j a: 5 dx. 5.2.2. j -4- d* 


5.2.3. ^ sin x dx . 5.2.4. J 


0 
jt/4 


5.2.5. j e*dx. 5.2.6. j 




da; 


5.3 General Properties of Integrals 

In Chapter 4 we established the
formulas for the derivative of a sum 
(and difference) of functions, the deriv- 
ative of a product (and ratio) of func- 
tions, and the derivatives of a com- 
posite function and of the inverse of a 
function. To each of these properties 
there corresponds a definite property 
referring to integrals. This section and 
Sections 5.4 and 5.5 are devoted to 
finding the “integral” analogs of the 
properties of derivatives considered. 

For integrals we have the following 
formula: 

$$\int [Af(x) + Bg(x)]\,dx = A\int f(x)\,dx + B\int g(x)\,dx. \qquad (5.3.1)$$

To prove it, we need only take the de- 
rivative of the expression on the right. 
If the formula is true, then we will ar- 
rive at the integrand. Indeed, 

$$\frac{d}{dx}\left[A\int f(x)\,dx + B\int g(x)\,dx\right] = A\frac{d}{dx}\left[\int f(x)\,dx\right] + B\frac{d}{dx}\left[\int g(x)\,dx\right] = Af(x) + Bg(x).$$


which tends to 2 as $\varepsilon \to 0$.

Consideration of this kind must al- 
ways be carried out if the integrand be- 
comes infinitely large; here, however, 
we will not discuss this important top- 
ic any further (see Section 6.3). 


We have thus proved the validity of 
(5.3.1). It shows that the integral of a 
sum of several terms splits up into the 
sum of the integrals of the separate sum- 
mands, and any constant factors can be 
taken outside the integral sign.

As an example to illustrate the above
statement, we consider the integral of
a polynomial:

$$\int (a_0x^k + a_1x^{k-1} + a_2x^{k-2} + \ldots + a_{k-2}x^2 + a_{k-1}x + a_k)\,dx$$
$$= a_0\int x^k\,dx + a_1\int x^{k-1}\,dx + a_2\int x^{k-2}\,dx + \ldots + a_{k-2}\int x^2\,dx + a_{k-1}\int x\,dx + a_k\int dx$$
$$= \frac{a_0}{k+1}x^{k+1} + \frac{a_1}{k}x^k + \frac{a_2}{k-1}x^{k-1} + \ldots + \frac{a_{k-2}}{3}x^3 + \frac{a_{k-1}}{2}x^2 + a_kx + C. \qquad (5.3.2)$$

Whence, for one thing,

$$\int_n^m (a_0x^k + a_1x^{k-1} + a_2x^{k-2} + \ldots + a_{k-2}x^2 + a_{k-1}x + a_k)\,dx$$
$$= \frac{a_0}{k+1}(m^{k+1} - n^{k+1}) + \frac{a_1}{k}(m^k - n^k) + \frac{a_2}{k-1}(m^{k-1} - n^{k-1}) + \ldots + \frac{a_{k-2}}{3}(m^3 - n^3) + \frac{a_{k-1}}{2}(m^2 - n^2) + a_k(m - n).$$


For instance, 

$$\int (2x^2 - 6x + 1)\,dx = \frac{2}{3}x^3 - 3x^2 + x + C,$$

$$\int_1^2 (2x^2 - 6x + 1)\,dx = \frac{2}{3}(8 - 1) - 3(4 - 1) + (2 - 1) = -3\tfrac{1}{3}.$$
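
Formula (5.3.2) is mechanical enough to be written as a short routine. The sketch below is one possible way to do it, not a method given in the text: the function names and the convention of listing coefficients from the highest power down (a_0 first, as in (5.3.2)) are assumptions of this example. Exact fractions are used so the result can be compared with the hand computation.

```python
from fractions import Fraction


def integrate_poly(coeffs):
    """Antiderivative of a_0*x**k + a_1*x**(k-1) + ... + a_k, formula (5.3.2).

    coeffs lists a_0, ..., a_k (highest power first). The result uses the same
    ordering, one degree higher, with the arbitrary constant C taken as 0.
    """
    k = len(coeffs) - 1
    # The term a_i * x**(k - i) integrates to a_i / (k - i + 1) * x**(k - i + 1).
    return [Fraction(a, k - i + 1) for i, a in enumerate(coeffs)] + [Fraction(0)]


def eval_poly(coeffs, x):
    """Evaluate a polynomial (highest power first) by Horner's scheme."""
    result = Fraction(0)
    for a in coeffs:
        result = result * x + a
    return result


def definite(coeffs, n, m):
    """Definite integral from n to m as a difference of antiderivative values."""
    anti = integrate_poly(coeffs)
    return eval_poly(anti, m) - eval_poly(anti, n)


anti = integrate_poly([2, -6, 1])    # coefficients of (2/3)x^3 - 3x^2 + x
value = definite([2, -6, 1], 1, 2)   # the worked example with limits 1 and 2
print(anti, value)
```

For the worked example the routine reproduces the antiderivative coefficients 2/3, -3, 1 and the value -10/3, that is, minus three and one third.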

It is possible, under the integral 
sign, to make a change of variable and 
pass to a new and more convenient 
variable. We will discuss this question 
in detail in Section 5.5, while here we 
examine a number of simple examples. 

1. Find $\int (ax + b)^n\,dx$, with $n \ne -1$.

For the new variable, call it z, we 
take the expression in the parentheses: 

ax + b =z. (5.3.3) 

In so doing, we also have to pass from 
differential dx to differential dz . From 


(5.3.3) we get dz = a dx, or $dx = a^{-1}\,dz$.
Thus,

$$\int (ax+b)^n\,dx = \int z^n\,\frac{dz}{a} = \frac{1}{a}\int z^n\,dz = \frac{z^{n+1}}{a(n+1)} + C = \frac{(ax+b)^{n+1}}{a(n+1)} + C. \qquad (5.3.4)$$


The correctness of the result is easily 
seen if we compute the derivative of 
the right member:

$$\frac{d}{dx}\left[\frac{(ax+b)^{n+1}}{a(n+1)} + C\right] = \frac{d}{dx}\left[\frac{(ax+b)^{n+1}}{a(n+1)}\right] = \frac{n+1}{a(n+1)}(ax+b)^n\,\frac{d(ax+b)}{dx} = \frac{(ax+b)^n}{a}\,a = (ax+b)^n.$$


2. In similar fashion, in the integral
$\int \frac{dx}{ax+b}$ we can make the change
of variable z = ax + b, $dx = a^{-1}\,dz$, and
hence

$$\int \frac{dx}{ax+b} = \frac{1}{a}\int \frac{dz}{z} = \frac{1}{a}\ln|z| + C = \frac{\ln|ax+b|}{a} + C. \qquad (5.3.5)$$


When we are dealing with such 
simple examples in practical situations, 
the transformations are ordinarily car- 
ried out without introducing separate 
designations for the new intermediate 
variables. For example, instead of 
(5.3.4) one writes 


$$\int (ax+b)^n\,dx = \frac{1}{a}\int (ax+b)^n\,d(ax+b) = \frac{1}{(n+1)a}(ax+b)^{n+1} + C.$$


We now give a more complicated 
example of an integral that can be re- 
duced to the integrals we already know 
via algebraic transformations. 

$$\int \frac{dx}{(x-a)(x-b)}.$$

Note that

$$\frac{1}{x-a} - \frac{1}{x-b} = \frac{a-b}{(x-a)(x-b)}, \qquad (5.3.6)$$

from which it follows that

$$\frac{1}{(x-a)(x-b)} = \frac{1}{a-b}\left(\frac{1}{x-a} - \frac{1}{x-b}\right),$$





whence

$$\int \frac{dx}{(x-a)(x-b)} = \frac{1}{a-b}\int\left(\frac{1}{x-a} - \frac{1}{x-b}\right)dx = \frac{1}{a-b}\left[\ln|x-a| - \ln|x-b|\right] + C = \frac{1}{a-b}\ln\left|\frac{x-a}{x-b}\right| + C. \qquad (5.3.7)$$


are found via the method of undetermined
coefficients (see the hint to Exercise 5.3.7);
to find $\int \frac{x\,dx}{x^2+1}$ it proves convenient to
denote $x^2 + 1$ by z.]

5.4 Integration by Parts 


(Formula (5.3.7) can be used in any domain
of integration that does not include
the points x = a and x = b.) An
integral of any algebraic fraction, that
is, the ratio $P_n(x)/Q_m(x)$ of polynomials
of degrees n and m, can be
expressed as a combination of elementary
functions (algebraic functions, loga- 
rithms, and arctangents). Several 
simple examples of this type are given 
in the exercises below. 


Exercises 


Find the following (indefinite) integrals: 


5.3.1. 


j (3x* — 2x+l )dx. 5.3.2. ( (4r 4 — 


3:r 3 + # 2 — x) dx. 5.3.3. j # (z — i)* dx. 

5.3.4. f **+ 2 *+ 3 5.3.5. C *4*. 

J X J X — 1 

5.3.6. $\int \frac{ax+b}{cx+d}\,dx$. [Hint. Employ the fact that
$\frac{ax+b}{cx+d} = \frac{a}{c} + \frac{bc-ad}{c(cx+d)}$.] 5.3.7. $\int \frac{x\,dx}{(x-2)(x-3)}$.
[Hint. Employ the identity $\frac{x}{(x-2)(x-3)} = \frac{A}{x-2} + \frac{B}{x-3}$;
the numbers A and B are
found via the method of undetermined
coefficients, that is, by equating the coefficients
of identical powers of x after clearing
fractions.]


j -i+ 2 


-[ 


dx. Hint. Use the identi- 


ty 


x-\- 2 


+ x 

x + 2 

' x{x* + l) 


A B 

x + z 2 + l 


Cx 


Let f(x) and g(x) be two distinct functions
of the variable x. The rule for
finding the derivative of a product
yields

$$\frac{d}{dx}(fg) = g\frac{df}{dx} + f\frac{dg}{dx}. \qquad (5.4.1)$$

This enables us to write 

$$fg = \int \frac{d(fg)}{dx}\,dx = \int g\frac{df}{dx}\,dx + \int f\frac{dg}{dx}\,dx. \qquad (5.4.2)$$

We can see the validity of (5.4.2) by 
taking the derivatives of the left and 
right members to get the true equa- 
tion (5.4.1). 

Let us rewrite (5.4.2) as

$$\int f\frac{dg}{dx}\,dx = fg - \int g\frac{df}{dx}\,dx,$$

or, which is the same,

$$\int f\,dg = fg - \int g\,df. \qquad (5.4.3)$$

What is the meaning of this formula? 
When evaluating an integral, there is 
unfortunately no rule that expresses 
the integral of a product of two func- 
tions in terms of the integrals of each 
of the factors. However, if in the product
of two functions fw the integral of
one of the factors is known, or

$$\int w\,dx = g, \qquad w = \frac{dg}{dx}, \qquad (5.4.4)$$

then it becomes possible to express the
integral $\int fw\,dx$ in terms of an integral
involving the derivative df/dx.
Using (5.4.4), we rewrite (5.4.3) in
the form

$$\int fw\,dx = f\left(\int w\,dx\right) - \int\left(\int w\,dx\right)\frac{df}{dx}\,dx. \qquad (5.4.5)$$

, where the coefficients A, B, and C

Since $\int w\,dx = g$, it follows that the
last integral in (5.4.5) is simply $\int g\,(df/dx)\,dx$.
Sometimes it is simpler than
the original integral $\int fw\,dx$ or reduces
to a known integral. If f is a power
function or a polynomial, df/dx is
also a power function or a polynomial
whose power is that of f minus unity.

Formula (5.4.3) or (5.4.5) is called 
the formula for integration by parts . 
From this formula follows a similar 
relationship for definite integrals: 

$$\int_a^b f(x)\,dg(x) = f(x)g(x)\Big|_a^b - \int_a^b g(x)\,df(x) = [f(b)g(b) - f(a)g(a)] - \int_a^b g(x)\,df(x). \qquad (5.4.3a)$$

Here are some examples illustrat- 
ing the method of integration by parts. 

1. Find $\int xe^x\,dx$. Put f = x. Then
$w = dg/dx = e^x$ and $e^x\,dx = dg$, $g = \int e^x\,dx = e^x$,
and df = dx. By formula
(5.4.3)

$$\int xe^x\,dx = xe^x - \int e^x\,dx = xe^x - e^x + C = e^x(x - 1) + C.$$

2. Find $\int x^2e^x\,dx$. Put $f = x^2$. Then
$w = dg/dx = e^x$, $e^x\,dx = dg$, $g = \int e^x\,dx = e^x$,
and df = 2x dx. By formula (5.4.3),

$$\int x^2e^x\,dx = x^2e^x - 2\int xe^x\,dx,$$

whence, using the result of the first
example, we get

$$\int x^2e^x\,dx = x^2e^x - 2xe^x + 2e^x + C = (x^2 - 2x + 2)e^x + C.$$

To find $\int P_n(x)e^{kx}\,dx$, where $P_n(x)$
is a polynomial of degree n, we have
to perform integration by parts n
times. We then get $Q_n(x)e^{kx}$, where
$Q_n(x)$ is a polynomial of degree n.

Knowing this, we need not perform
integration by parts n times, but can
write down directly the coefficients of
the polynomial $Q_n(x)$.

Let us take the same Example 2. Find
$\int x^2e^x\,dx$. We write the equation $\int x^2e^x\,dx = Q_2(x)e^x + C$
with the (still) unknown coefficients
of the polynomial $Q_2(x)$:

$$\int x^2e^x\,dx = (a_2x^2 + a_1x + a_0)e^x + C. \qquad (5.4.6)$$

Taking the derivatives of both sides of (5.4.6),
we get

$$x^2e^x = (2a_2x + a_1)e^x + (a_2x^2 + a_1x + a_0)e^x,$$

or

$$x^2e^x = [x^2a_2 + x(2a_2 + a_1) + (a_1 + a_0)]e^x.$$

Equate the coefficients of identical powers of x
in the polynomials on the right and left to
get $a_2 = 1$, $2a_2 + a_1 = 0$, $a_1 + a_0 = 0$, whence
$a_1 = -2$ and $a_0 = 2$. Finally, as before,
we get

$$\int x^2e^x\,dx = (x^2 - 2x + 2)e^x + C.$$
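
The method of undetermined coefficients used here turns into a simple recursion. If Q(x)e^{kx} is to have the derivative P(x)e^{kx}, then Q' + kQ = P, and comparing the coefficients of x^i from the highest power down gives each coefficient of Q in turn. The following is a sketch of that idea; the function name and the convention of listing coefficients from the lowest power up are assumptions of this example, not notation from the text.

```python
from fractions import Fraction


def exp_poly_antiderivative(p, k=1):
    """Coefficients of Q such that d/dx [Q(x) e^{kx}] = P(x) e^{kx}.

    p lists the coefficients of P from the lowest power up: p[i] multiplies x**i.
    The condition Q' + k*Q = P gives, for the coefficient of x**i,
        (i + 1) * q[i + 1] + k * q[i] = p[i],
    which is solved starting from the highest power.
    """
    n = len(p) - 1
    q = [Fraction(0)] * (n + 1)
    for i in range(n, -1, -1):
        higher = (i + 1) * q[i + 1] if i < n else 0
        q[i] = (Fraction(p[i]) - higher) / k
    return q


# P(x) = x**2, k = 1: expect Q(x) = x**2 - 2x + 2, i.e. [2, -2, 1] lowest first.
q = exp_poly_antiderivative([0, 0, 1])
print(q)

# P(x) = x, k = 2: expect Q(x) = x/2 - 1/4.
q2 = exp_poly_antiderivative([0, 1], k=2)
print(q2)
```

The first call reproduces the values a_2 = 1, a_1 = -2, a_0 = 2 obtained by equating coefficients by hand.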

By a similar technique we can find the integrals
of the functions $P_n(x)\cos kx$ and
$P_n(x)\sin kx$, with $P_n(x)$ a polynomial.
In both cases the answer is of the form
$Q_n(x)\cos kx + R_n(x)\sin kx$, where $Q_n(x)$
and $R_n(x)$ are polynomials of degree n
(or less than n). Examples of this kind are 
given in the exercises below. 


Exercises 


Find the following integrals: 


5.4.1. $\int x\cos x\,dx$. 5.4.2. $\int \ln x\,dx$.
5.4.3. $\int x^2\sin 2x\,dx$. 5.4.4. $\int x^2e^{-x}\,dx$.
5.4.5. $\int (x^2 + x + 1)\cos x\,dx$. 5.4.6. $\int (2x^2 + 1)\cos 3x\,dx$.
[Hint. Exercise 5.4.6 is studied
in detail in Hints, Answers, and Solutions
at the end of the book; Examples 5.4.3 to
5.4.5 are handled similarly to Example 5.4.6.]
5.4.7. $\int \arcsin x\,dx$. 5.4.8. $\int \arctan x\,dx$.
5.4.9. $\int e^{2x}\sin 3x\,dx$. 5.4.10. $\int e^x\cos 2x\,dx$.

5.5 Change of Variables in Integration

We have already mentioned the gen- 
eral method of changing the variables 
in integration, or integration by sub- 
stitution (of a new unknown for the initial 

unknown). If in the integral $\int F(x)\,dx$
the integrand F(x) can be represented
in the form

$$F(x) = f(\varphi(x))\,\varphi'(x),$$

where $\varphi(x)$ is a function of variable x,
we can denote $\varphi(x)$ by a single letter z
and take z as the new variable. Since
$\varphi'(x)\,dx = dz$, we arrive at the integral
$\int f(z)\,dz$, which may prove to be
tabulated (that is, it may be found
among the integrals of Section 5.2) or
even be simpler than the initial integral.
This general method is widely
used in integral calculus.

For instance, the integration of many
functions involving radicals and trigonometric
functions may be reduced, by
means of an appropriate change of
variable, to the integration of polynomials
or algebraic fractions with integral
powers. Let us consider several examples.

Find $\int x\sqrt{x+1}\,dx$. We make a change
of variable: $z = \sqrt{x+1}$, $x + 1 = z^2$,
whence $x = z^2 - 1$ and dx = 2z dz.
Passing to the new variable in the integral,
we obtain

$$\int x\sqrt{x+1}\,dx = \int (z^2 - 1)\,z\cdot 2z\,dz = 2\int (z^4 - z^2)\,dz = 2\frac{z^5}{5} - 2\frac{z^3}{3} + C = \frac{2}{5}(x+1)^{5/2} - \frac{2}{3}(x+1)^{3/2} + C.$$

In the integral $\int \frac{dx}{a^2+x^2}$ the appropriate
change of variable is x = at,
that is, t = x/a; then dx = a dt and we
have

$$\int \frac{dx}{a^2+x^2} = \int \frac{a\,dt}{a^2+a^2t^2} = \int \frac{a\,dt}{a^2(1+t^2)} = \frac{1}{a}\int \frac{dt}{1+t^2} = \frac{1}{a}\arctan t + C = \frac{1}{a}\arctan\frac{x}{a} + C.$$

In the integral $\int \frac{dx}{\sqrt{(a^2-x^2)^3}}$ the
appropriate change of variable is x = a cos t,
that is, t = arccos(x/a); then
dx = -a sin t dt, $a^2 - x^2 = a^2(1 - \cos^2 t) = a^2\sin^2 t$,
and we obtain

$$\int \frac{dx}{\sqrt{(a^2-x^2)^3}} = \int \frac{-a\sin t\,dt}{\sqrt{(a^2\sin^2 t)^3}} = -\int \frac{a\sin t\,dt}{a^3\sin^3 t} = -\frac{1}{a^2}\int \frac{dt}{\sin^2 t} = \frac{1}{a^2}\cot t + C = \frac{1}{a^2}\cot\left(\arccos\frac{x}{a}\right) + C.$$


This expression can be simplified still 
further (see Exercise 5.5.10). 

An almost similar substitution can
be applied to the "almost tabular"
integral $\int \frac{dx}{\sqrt{a^2-x^2}}$ (which by the substitution
x = az, dx = a dz is reduced
to the known result $\int \frac{dz}{\sqrt{1-z^2}} = \arcsin z + C$).
Indeed, if x = a sin t,
dx = a cos t dt, and $\sqrt{a^2-x^2} = a\cos t$,
then

$$\int \frac{dx}{\sqrt{a^2-x^2}} = \int \frac{a\cos t\,dt}{a\cos t} = \int dt = t + C = \arcsin\frac{x}{a} + C,$$

since t = arcsin(x/a).

The integral $\int \frac{dx}{\sqrt{a^2+x^2}}$ can be evaluated
in a similar manner, only instead
of trigonometric functions we
must employ the so-called hyperbolic
sine $\sinh t = \frac{1}{2}(e^t - e^{-t})$ and the
hyperbolic cosine $\cosh t = \frac{1}{2}(e^t + e^{-t})$
(for more details on these functions
see Section 14.4). We can easily
see (verify this!) that $\cosh^2 t - \sinh^2 t = 1$,
$(\sinh t)' = \cosh t$, and $(\cosh t)' = \sinh t$.
Therefore, if we put x = a sinh t,
then dx = a cosh t dt and
$\sqrt{a^2+x^2} = a\cosh t$. Thus, our integral
takes the form

$$\int \frac{dx}{\sqrt{a^2+x^2}} = \int \frac{a\cosh t\,dt}{a\cosh t} = \int dt = t + C = \operatorname{arsinh}\frac{x}{a} + C,$$

where arsinh z is the inverse hyperbolic
sine defined by the condition: if t = arsinh z,
then z = sinh t.

We leave to the reader the task of 
bringing this example to the final re- 
sult (see Exercise 5.5.11). 
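
A result obtained by substitution can be checked against a direct numerical summation. The sketch below does this for two of the integrals of this section; the midpoint-rule helper, the sample value a = 2 and the upper limit X = 3 are assumptions chosen only for the illustration. Python's math.asinh and math.atan play the role of arsinh and arctan.

```python
import math


def midpoint(f, lo, hi, n=20000):
    """Approximate the integral of f from lo to hi by the midpoint rule."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h


a, X = 2.0, 3.0   # illustrative parameter and upper limit

# Integral of dx / sqrt(a^2 + x^2) from 0 to X versus arsinh(x/a).
numeric_sinh = midpoint(lambda x: 1 / math.sqrt(a * a + x * x), 0.0, X)
closed_sinh = math.asinh(X / a) - math.asinh(0.0)

# Integral of dx / (a^2 + x^2) from 0 to X versus (1/a) arctan(x/a).
numeric_atan = midpoint(lambda x: 1 / (a * a + x * x), 0.0, X)
closed_atan = math.atan(X / a) / a

print(numeric_sinh, closed_sinh)
print(numeric_atan, closed_atan)
```

Agreement to many digits does not replace the derivation, but a mismatch would point at once to a wrong substitution or a lost factor of a.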

Finally, we give an example of an 
integral that cannot be represented in 
terms of a finite number of elementary 
functions. The integral is 

$$f(x) = \int e^{-x^2}\,dx. \qquad (5.5.1)$$

The proof that it cannot be expressed 
in terms of a finite number of elementary 
functions (power functions, including 
functions with fractional powers, the 
exponential function, the logarithm, the 
trigonometric functions, and the in- 
verse trigonometric functions) is ex- 
tremely complicated and we will not 
give it here. 

The integral (5.5.1) constitutes a
function whose properties can be studied.
The definition of f(x) implies
that

$$\frac{df}{dx} = e^{-x^2}.$$

Since $f'(x) = e^{-x^2}$ is positive for arbitrary
x, it follows that f(x) is an
increasing function. The derivative of
f(x) has a maximum at x = 0, where
it is equal to unity, which means that
f(x) has a maximum angle of the tangent
line with the x axis at x = 0
(the angle is 45°). For large absolute
values of x (positive or negative), the
derivative df/dx is very small, whereby
the function is almost constant for such
values. The graph of the function
$\Phi(x) = \int_0^x e^{-x^2}\,dx$ is depicted in Figure 5.5.1
(for the sake of definiteness,
the lower limit of integration has been
chosen to be equal to zero). The fact
that $\Phi(\infty) \approx 0.886$ is equal to $\sqrt{\pi}/2$
is given without proof in this book.

Extensive tables have been compiled
for the function $\Phi(x)$, and so computations
involving integral (5.5.1) are
no more complicated than, say, those
involving trigonometric functions.

Figure 5.5.1
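
In present-day practice the role of such tables is played by library functions. As a hedged illustration: Python's standard math.erf computes (2/sqrt(pi)) times the integral of exp(-t^2) from 0 to x, so the function with lower limit zero discussed here equals (sqrt(pi)/2) erf(x). The names Phi and Phi_numeric below are chosen for this sketch; the second one sums the integrand directly as a cross-check.

```python
import math


def Phi(x):
    """Integral of exp(-t**2) from 0 to x, expressed through math.erf."""
    return math.sqrt(math.pi) / 2 * math.erf(x)


def Phi_numeric(x, n=10000):
    """The same integral by direct midpoint summation (cross-check only)."""
    h = x / n
    return sum(math.exp(-((i + 0.5) * h) ** 2) for i in range(n)) * h


for x in (0.5, 1.0, 2.0):
    print(x, Phi(x), Phi_numeric(x))

limit = Phi(10.0)   # practically the value at infinity, sqrt(pi)/2
print(limit)
```

The printed limit agrees with sqrt(pi)/2, the value quoted in the text without proof.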

Note that often, when using various
techniques, one obtains distinct expressions
for one and the same integral
(see Exercise 5.5.6). This should not 
dismay the reader. If the computa- 
tions are correct, such expressions should 
differ by a constant only. The re- 
sults are identical when evaluating a 
definite integral (for any limits of in- 
tegration). 

We see that computing an integral 
is much more complicated than find- 
ing derivatives: here one must employ 
various ingenious techniques (integra- 
tion by parts, change of variable), and 
often it is not clear at first glance what 
technique to apply (e.g. how to choose 
the new variable). Moreover, in some 
cases (as in the case with integral (5.5.1)) 
there is not a single technique that 
enables finding the integral in closed 
form, that is, expressing it in terms of 
a finite number of elementary functions, 
and there is not a single general meth- 
od that could establish whether a giv- 
en integral can be expressed in such 
a form.^{5.3} Integration techniques cannot,
in principle, be reduced to algorithms,
while with differentiation techniques
(i.e. finding the derivatives) this
is possible. Often it is easier to use
special tables when one has to find integrals
of functions. Such tables of integrals
contain lists of (indefinite) integrals
classified according to the type
of integrand and other characteristics
that enable finding the sought integral
easily, as well as certain types of
definite integrals (with limits of integration
0 and $\infty$, 0 and $\pi$, $-\infty$ and
$\infty$, etc.).^{5.4} When an integral cannot
be evaluated, one has to resort to (approximate)
numerical integration.

5.3 Often we encounter the case where the
formula for a definite integral $\int_a^b f(x)\,dx$
with known limits of integration a and b
exists, but the general formula for the corresponding
indefinite integral cannot be found.
For instance, as noted earlier, $\int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}/2$,
where $\int_0^\infty e^{-x^2}\,dx = \lim_{M\to\infty}\int_0^M e^{-x^2}\,dx$,


Exercises 

Find the following integrals: 

5.5.1. $\int \cos(3x - 5)\,dx$. 5.5.2. $\int \sin(2x + 1)\,dx$.
5.5.3. $\int \sqrt{3x - 2}\,dx$. 5.5.4. $\int \frac{dx}{x + \sqrt{x}}$.
[Hint. Make the substitution $\sqrt{x} = z$.]

J x dx 


[Bint. Make the substitu- 


tion a: 2 — 5 = z.] 5.5.6. ^ sin 3 x cos x dx. [Hint. 

Make the substitution sina: = z or cos x = u.] 

.5.5.7. f C0S ^- 
J sin 4 x 

tution sin x = z.] 5 


dx. [Hint. Make the substi- 
.5.8. j tan x dx. 


f dx 

5 - 5 - 9 - J 


5.5.10. Prove that

$$\int \frac{dx}{\sqrt{(a^2-x^2)^3}} = \frac{x}{a^2\sqrt{a^2-x^2}} + C.$$


while the corresponding indefinite integ- 
ral cannot be calculated. (One method of eva- 
luating a definite integral that does not rely 
on formulas for the corresponding indefinite 
integral is discussed in Section 17.3.) 

5 - 4 See, for instance, I.N. Bronsthein and 
K.A. Semendyayew, A Guide-Book in Mathe- 
matics, Deutsch, Frankfurt, 1971; H.B. Dwight, 
Mathematical Tables , McGraw-Hill, New 
York, 1941; and I.S. Gradshteyn and I.M. Ry- 
zhik, Tables of Integrals , Series and Products , 
4th ed., Academic Press, New York, 1966. 


5.5.11. Find the integral $\int \frac{dx}{\sqrt{a^2+x^2}}$.

5.5.12. Find the integral $\int \frac{dx}{(x^2+a^2)^2}$.


5.6 Change of Variable in a Definite 
Integral 

We consider an example. Let it be 
required to calculate the definite
integral $\int_p^q (ax+b)^2\,dx$. We can do as
follows. First calculate the indefinite
integral $\int (ax+b)^2\,dx$ and then form
the difference of its values for x = q
and x = p.

To compute $\int (ax+b)^2\,dx$ we make
a change of variable using the formula
z = ax + b. Then dz = a dx and

$$\int (ax+b)^2\,dx = \frac{1}{a}\int z^2\,dz = \frac{z^3}{3a} + C = \frac{(ax+b)^3}{3a} + C.$$

Therefore

$$\int_p^q (ax+b)^2\,dx = \frac{(ax+b)^3}{3a}\Big|_p^q = \frac{(aq+b)^3 - (ap+b)^3}{3a}.$$

However, it is possible to do other- 
wise. Let us determine how z will vary 
when x varies from p to q. Since z and x 
are related as z = ax + b, it follows
that as x varies from p to q, the new
variable z will vary from ap + b to
aq + b. Hence,

$$\int_p^q (ax+b)^2\,dx = \frac{1}{a}\int_{ap+b}^{aq+b} z^2\,dz = \frac{z^3}{3a}\Big|_{ap+b}^{aq+b} = \frac{(aq+b)^3 - (ap+b)^3}{3a}.$$

When evaluating integrals, it is convenient
to proceed in just this way, that is,
when making a change of variable, to
find the new limits of integration at the
same time. This will obviate a return
to the old variable in the expression for
the indefinite integral.

Let us consider some examples.

1. Calculate $\int_0^1 \frac{dx}{(2-x)^3}$. Note from
the start that the function $(2-x)^{-3}$
assumes positive values as x varies from
0 to 1, and therefore the integral is
positive. At the same time, the denominator
in this domain of integration
does not vanish, so that the integrand
is finite throughout the domain.
Make the change of variable 2 - x = y,
with dx = -dy. Then y = 2 at
x = 0 and y = 1 at x = 1, so that

$$\int_0^1 \frac{dx}{(2-x)^3} = -\int_2^1 \frac{dy}{y^3}. \qquad (5.6.1)$$

On the right-hand side of (5.6.1) the 
limits of integration are given for y . 
The reader may wonder about the mi- 
nus sign in the last equation. Indeed, 
on the right and left we have integrals 
of positive functions, so why is the 
right-hand side of (5.6.1) positive? The 
point is that the lower limit of inte- 
gration on the right is greater than the 
upper limit. Since an integral changes 
sign upon interchanging the limits of 
integration, (5.6.1) may be written as 
follows: 

$$\int_0^1 \frac{dx}{(2-x)^3} = \int_1^2 \frac{dy}{y^3}.$$

Now, in the right-hand integral, the
upper limit exceeds the lower one and
it is clear that the integral on the right
is positive. The computation can now
be completed with ease:

$$\int_1^2 \frac{dy}{y^3} = -\frac{1}{2y^2}\Big|_1^2 = -\frac{1}{8} + \frac{1}{2} = \frac{3}{8}.$$
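
The bookkeeping of limits and signs in this example can be checked numerically. The sketch below is only an illustration with an assumed midpoint-rule helper: it evaluates the integral in the old variable x, then in the new variable y with the limits in the order produced by the substitution (from 2 down to 1, with the minus sign), and finally with the limits interchanged.

```python
def midpoint(f, lo, hi, n=10000):
    """Midpoint rule; if lo > hi the step is negative and so is the sign of the sum."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h


# Left-hand side: integral of dx / (2 - x)**3 from x = 0 to x = 1.
left = midpoint(lambda x: 1 / (2 - x) ** 3, 0.0, 1.0)

# After y = 2 - x, dx = -dy: minus the integral of dy / y**3 from y = 2 to y = 1.
substituted = -midpoint(lambda y: 1 / y ** 3, 2.0, 1.0)

# Interchanging the limits removes the minus sign.
swapped = midpoint(lambda y: 1 / y ** 3, 1.0, 2.0)

print(left, substituted, swapped)
```

All three numbers agree with the value 3/8 found above, which shows that the minus sign and the reversed limits compensate each other.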


2. In Section 5.5 we considered the
function $\Phi(x) = \int_0^x e^{-x^2}\,dx$. One often has
to deal with the function
$\varphi(a) = \int_0^a e^{-kx^2}\,dx$, where k is a constant.
We will show that there is a simple
relationship between the functions $\Phi$
and $\varphi$.

In the expression for $\varphi(a)$ make the
change of variable $kx^2 = t^2$. From this
we find that $\sqrt{k}\,x = t$, $x = t/\sqrt{k}$, and
$dx = (1/\sqrt{k})\,dt$. It is clear that
t = 0 at x = 0 and $t = a\sqrt{k}$ at x = a.
And so we get

$$\varphi(a) = \int_0^{a\sqrt{k}} e^{-t^2}\,\frac{dt}{\sqrt{k}} = \frac{1}{\sqrt{k}}\int_0^{a\sqrt{k}} e^{-t^2}\,dt = \frac{1}{\sqrt{k}}\,\Phi(a\sqrt{k}).$$

Consequently, for an arbitrary value
of the independent variable x,

$$\varphi(x) = \frac{1}{\sqrt{k}}\,\Phi(x\sqrt{k}). \qquad (5.6.2)$$

If we have a table of values of the function
$\Phi(x)$, it is possible to find the
values of $\varphi(x)$ for any value of k, and
vice versa, knowing the values of $\varphi(x)$
which correspond to a definite value of
k, we can find, via (5.6.2), the value of
$\Phi(z)$ for any z (in (5.6.2) we are
forced to put $z = x\sqrt{k}$).^{5.5}

3. In Chapter 3 we saw that the defi- 
nite integral has dimensions if the 
integrand and the limits of integration 
have. It is often convenient, however, 
to reduce the integral to a dimensionless 
form by taking all factors having di- 
mensions outside the integral sign. 
What remains to be done is to consider 


5.5 Note that more customary are not tables
of the values of $\Phi(x) = \int_0^x e^{-x^2}\,dx$, of which
we spoke at the end of Section 5.5, but
those of the function $\varphi(x) =$


‘Y ^ 


i 


r xZ/2 dx , 


which is closely related to $\Phi(x)$ (see formula
(5.6.2), where in this case we must put
k = 1/2). The function $\varphi(x)$ plays a very
important role in the more advanced section
of mathematics, the theory of probability,
of which we speak in brief in the Conclusion
of this book.

(transform) a numerical (dimensionless)
integral. Let us see how to transform
an integral to dimensionless form.
Suppose we have an integral
$\int_a^b f(x)\,dx$, where, for the sake of simplicity,
we assume f(x) to be nonnegative.
Denote by $f_{\max}$ the greatest
value of the function f(x) on the domain
of integration and take it outside the
integral sign:^{5.6}

$$\int_a^b f(x)\,dx = f_{\max}\int_a^b \frac{f(x)}{f_{\max}}\,dx. \qquad (5.6.3)$$


It is obvious that in the last integral
the integrand $f(x)/f_{\max}$ is dimensionless,
since f(x) and $f_{\max}$ have the same
dimensions. Let us pass to dimensionless
limits of integration. To do this,
make the change of variable

$$z = \frac{x - a}{b - a}, \quad \text{or} \quad x = a + z(b - a). \qquad (5.6.4)$$

We see that z is dimensionless. Since
dx = (b - a) dz and z = 0 at x = a
and z = 1 at x = b, it follows that the
integral on the right-hand side of (5.6.3)
takes the form

$$\int_a^b \frac{f(x)}{f_{\max}}\,dx = (b - a)\int_0^1 \frac{f(a + z(b - a))}{f_{\max}}\,dz. \qquad (5.6.5)$$

Set $f(a + z(b - a))/f_{\max} = \varphi(z)$. Then
from (5.6.5) we have

$$\int_a^b \frac{f(x)}{f_{\max}}\,dx = (b - a)\int_0^1 \varphi(z)\,dz$$


5.6 We have assumed that $f(x) \ge 0$ and
$f_{\max} \ne 0$. In the case of a function that
changes sign or is negative, for $f_{\max}$ we must
take the greatest of the absolute values of
function f, or $|f|_{\max}$.


and, finally,

$$\int_a^b f(x)\,dx = f_{\max}(b - a)\int_0^1 \varphi(z)\,dz, \qquad (5.6.6)$$

where $\int_0^1 \varphi(z)\,dz$ is a dimensionless
quantity.

If f(x) varies but slightly on the
domain of integration, then $f(x)/f_{\max} \approx 1$
(but, of course, $f(x)/f_{\max} \le 1$, since
$f(x) \le f_{\max}$), and so $\varphi(z) \approx 1$ (but
$\varphi(z) \le 1$) and $I = \int_0^1 \varphi(z)\,dz \approx 1\cdot\int_0^1 dz = 1$
(but $I \le 1$). Then the dimensionless
factor $I = \int_0^1 \varphi(z)\,dz$ is of the order of
unity and the value of integral (5.6.3)
is determined mainly by the product
$f_{\max}(b - a)$.

Let us consider a simple example:
the free fall of a body in the course
of time $t_0$. The distance of free fall
is equal to $\int_0^{t_0} v(t)\,dt$. The acceleration
of the falling body is g, and, therefore,
the rate of fall, or velocity,
is v(t) = gt (we assume that v(0) = 0
and that the maximum velocity $v_{\max}$ is
attained at time $t_0$, or $v_{\max} = gt_0$).
Note that the maximum here is not
due to any decrease in velocity after
$t = t_0$ but simply to the fact that times
exceeding $t_0$ and corresponding to velocities
greater than $gt_0$ lie outside the
domain of integration $0 \le t \le t_0$. We
introduce the notation $z = t/t_0$,
$\varphi(z) = v/v_{\max} = gt/gt_0 = t/t_0 = z$. The result
is

$$\int_0^{t_0} v(t)\,dt = v_{\max}t_0\int_0^1 z\,dz = gt_0\cdot t_0\cdot\frac{1}{2} = \frac{gt_0^2}{2}.$$

We see that in the given case the dimensionless
factor $I = \int_0^1 \varphi(z)\,dz$ (whose
order of magnitude is generally equal
to unity but whose magnitude is
always smaller than unity) is equal
to 0.5, which agrees fairly well with
our rough estimate.
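
The reduction (5.6.6) can be tried out numerically. The following is a sketch under stated assumptions: the function name dimensionless_form is hypothetical, the greatest value of |f| is only estimated by sampling (including the end points), and the numerical values g = 9.8 and t_0 = 3 are chosen for illustration.

```python
def dimensionless_form(f, a, b, n=1000):
    """Split the integral of f over [a, b] as f_max * (b - a) * I, cf. (5.6.6).

    I is the integral over 0 <= z <= 1 of phi(z) = f(a + z*(b - a)) / f_max,
    computed by the midpoint rule. f_max is estimated by sampling, which is
    an assumption of this sketch, not an exact maximum search.
    """
    zs = [(i + 0.5) / n for i in range(n)]
    samples = [f(a + z * (b - a)) for z in zs]
    f_max = max(max(abs(s) for s in samples), abs(f(a)), abs(f(b)))
    I = sum(s / f_max for s in samples) / n
    return f_max, b - a, I


g, t0 = 9.8, 3.0                       # illustrative values
v_max, duration, I = dimensionless_form(lambda t: g * t, 0.0, t0)
distance = v_max * duration * I        # should equal g * t0**2 / 2
print(v_max, duration, I, distance)
```

For the free fall the dimensionless factor comes out as 0.5, and the product of the three factors reproduces the distance g t_0^2 / 2.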


5.7 Integrating Functions Dependent 
on a Parameter 


In Section 4.12 we discussed briefly 
the problem of differentiating func- 
tions dependent on a parameter. Here 
we wish to investigate the question of 
integration of such functions. The definite
integral $I = \int_a^b f(x)\,dx$ is a number,
but if the integrand depends on a
parameter s, that is, f = f(x, s), the
number will depend on this parameter,
too, that is, we will have I = I(s).
For instance, it is clear that 
+i 

j (kx + k~ l ) dx = ^Y kx 2 - f y x j J 1 ^ 

-i 

1 k 

j' sin Ax dx = -jj- j sin y dy 

0 o 

1 / \ 1^ 1 — COS A jr n o\ 

= x (- C os!/) | Q = — , (5.7.2) 

1 

j (s+l)xMz = (s + l) = 1 
0 


for s> — 1. 


(5.7.3) 


We see that integral (5.7.1) depends 
on parameter k, integral (5.7.2) de- 
pends on number A, while integral 
(5.7.3) seems to depend on s , but this 
dependence is fictitious, since I ( 5 ) = 1 
is independent of s, that is, I ( 5 ) is 
a function of s that equals unity iden- 
tically. Example (5.7.3) is also in- 
structive due to the condition that s 
must be greater than — 1 (without this 


condition the result will be simply 
incorrect; we will return to this con- 
dition later). 
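The three integrals above can be evaluated numerically for a few parameter values. This is a rough sketch only: the midpoint-rule helper and the function names `integral_571`, `integral_572`, and `integral_573` are assumptions made for illustration. The third integral is tried only for $s \ge 0$, where the integrand is bounded and the simple rule converges quickly.

```python
import math


def midpoint_integral(func, lower, upper, steps=20_000):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def integral_571(k):
    """Integral of (k*x + 1/k) over [-1, 1]; the closed form is 2/k."""
    return midpoint_integral(lambda x: k * x + 1.0 / k, -1.0, 1.0)


def integral_572(lam):
    """Integral of sin(lam*x) over [0, 1]; the closed form is (1 - cos(lam))/lam."""
    return midpoint_integral(lambda x: math.sin(lam * x), 0.0, 1.0)


def integral_573(s):
    """Integral of (s + 1) * x**s over [0, 1]; equals 1 for every s > -1."""
    return midpoint_integral(lambda x: (s + 1.0) * x ** s, 0.0, 1.0)


for k in (0.5, 2.0, 4.0):
    print("k =", k, " numeric:", integral_571(k), " closed form:", 2.0 / k)

for lam in (1.0, 3.0):
    print("lambda =", lam, " numeric:", integral_572(lam),
          " closed form:", (1.0 - math.cos(lam)) / lam)

for s in (0.5, 2.0, 5.0):
    print("s =", s, " numeric:", integral_573(s))
```

The first two results vary with the parameter, while the third stays at unity for every admissible $s$.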

The definition (3.2.4) of a definite integral reduces an integral to a sum of a large number of small summands. If the integrand $f(x) = f(x, s)$ (the function $v(t)$ in the notation of formula (3.2.4) or the function $y(x)$ in the notation of formula (3.2.6)) depends on a parameter $s$, all the summands will also depend on this parameter. Since the sum of a finite number of functions (in our case, functions of parameter $s$) can be differentiated and integrated term by term, the function

$$\int_a^b f(x, s)\,dx = I(s) \tag{5.7.4}$$

can also be differentiated and integrated with respect to $s$ under the integral sign:

$$I'(s) = \frac{d}{ds}\int_a^b f(x, s)\,dx = \int_a^b \frac{\partial f(x, s)}{\partial s}\,dx, \tag{5.7.5a}$$

$$\int_\alpha^\beta I(s)\,ds = \int_\alpha^\beta \Big(\int_a^b f(x, s)\,dx\Big)\,ds = \int_a^b \Big(\int_\alpha^\beta f(x, s)\,ds\Big)\,dx, \tag{5.7.5b}$$

where under the integral sign on the right-hand side of (5.7.5a) is the derivative of $f(x, s)$ with respect to $s$ calculated on the assumption that $x$ is kept constant, what is known as the partial derivative of $f(x, s)$ with respect to $s$ (see Section 4.12). (To prove the validity of the rules (5.7.5a) and (5.7.5b) of differentiation and integration under the integral sign, it suffices to replace the integral $I(s)$ with the approximating “integral sum,” of the type used in formulas (3.2.4) and (3.2.6), to differentiate or integrate this sum term by term, and then pass to the limit as the number of summands grows without limit.) For instance,

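$$\frac{d}{dk}\int_{-1}^{1} (kx + k^{-1})\,dx = \int_{-1}^{1} \frac{\partial (kx + k^{-1})}{\partial k}\,dx = \int_{-1}^{1} (x - k^{-2})\,dx$$

and

$$\int_\alpha^\beta \Big(\int_0^1 \sin \lambda x\,dx\Big)\,d\lambda = \int_0^1 \Big(\int_\alpha^\beta \sin \lambda x\,d\lambda\Big)\,dx = \int_0^1 \Big(-\frac{\cos \lambda x}{x}\Big)\Big|_{\lambda=\alpha}^{\lambda=\beta}\,dx = \int_0^1 \frac{\cos \alpha x - \cos \beta x}{x}\,dx.$$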
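Differentiation under the integral sign can be tried out on the integral (5.7.1). The sketch below is an illustration under stated assumptions (a midpoint-rule helper and a symmetric difference quotient with a hypothetical step `h`); it compares the derivative of $I(k)$ estimated from values of $I$ with the integral of the partial derivative $x - k^{-2}$.

```python
def midpoint_integral(func, lower, upper, steps=20_000):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def I(k):
    """I(k): integral of (k*x + 1/k) over [-1, 1], as in (5.7.1)."""
    return midpoint_integral(lambda x: k * x + 1.0 / k, -1.0, 1.0)


def derivative_from_values(k, h=1e-4):
    """Estimate I'(k) by a symmetric difference quotient of I itself."""
    return (I(k + h) - I(k - h)) / (2.0 * h)


def derivative_under_sign(k):
    """Integrate the partial derivative (x - 1/k**2) over [-1, 1], as in (5.7.5a)."""
    return midpoint_integral(lambda x: x - 1.0 / k ** 2, -1.0, 1.0)


k = 2.0
print("difference quotient:     ", derivative_from_values(k))
print("integral of d/dk:        ", derivative_under_sign(k))
print("closed form -2/k**2:     ", -2.0 / k ** 2)
```

For this proper integral the two routes agree, as rule (5.7.5a) states.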


Difficulties may arise when the integral (5.7.4) is “improper,” that is, when either the function $f(x)$ in the domain of integration becomes infinite or the domain itself is infinitely large (here, say, $\int_a^\infty f(x)\,dx = \lim_{M\to\infty} \int_a^M f(x)\,dx$; cf. (7.5.5)). For instance, for $s < 0$ the integral (5.7.3) becomes an improper integral, since the function $(s+1)x^s$ for such values of $s$ becomes infinite at $x = 0$; for $0 \ge s > -1$ this integral assumes a constant value equal to unity; and for $s < -1$ the integral itself becomes infinitely large. But if, say, $I(s) = \int_a^\infty f(x, s)\,dx$ is “normally convergent,” that is, there exists a function $F(x)$ that is greater than $|f(x, s)|$ and such that $\int_a^\infty F(x)\,dx$ is finite, then we can deal with $I(s)$ as if it were an ordinary, “proper,” integral (for instance, formulas (5.7.5a) and (5.7.5b) remain valid for this integral). For example, since $|(\sin \lambda x)/x^2| \le 1/x^2$ and $\int_1^\infty dx/x^2 = (-1/x)\big|_1^\infty = 1 < \infty$, we can state that $I(\lambda) = \int_1^\infty \frac{\sin \lambda x}{x^2}\,dx$ is a normally convergent integral and, say,

$$I'(\lambda) = \int_1^\infty \frac{\partial}{\partial \lambda}\Big(\frac{\sin \lambda x}{x^2}\Big)\,dx = \int_1^\infty \frac{\cos \lambda x}{x}\,dx.$$

If integral (5.7.4) is not normally convergent, the direct application of rules (5.7.5a) and (5.7.5b) may lead to mistakes (here we may even encounter a case where the integral $I(s)$ is finite, while, say, its derivative, constructed according to (5.7.5a), $I'(s) = \int_a^b \frac{\partial f}{\partial s}\,dx$, is infinite, or the other way round).

For instance, we are sure that $\int_0^\infty \frac{dx}{x+s} = \ln(x+s)\big|_0^\infty = \infty$ (we assume that $s$ is positive). But if we find the derivative of the integrand with respect to $s$, we arrive at the “finite” derivative $I'(s)$ of the “infinite” integral $I(s)$:

$$I'(s) = \frac{d}{ds}\int_0^\infty \frac{dx}{x+s} = \int_0^\infty \frac{\partial}{\partial s}\Big(\frac{1}{x+s}\Big)\,dx = -\int_0^\infty \frac{dx}{(x+s)^2} = \frac{1}{x+s}\Big|_{x=0}^{x=\infty} = -\frac{1}{s}.$$

To clarify the meaning of this paradoxical result, we proceed as follows. Let us “cut off” the singularity in the integral $\int_0^\infty \frac{dx}{x+s}$, that is, replace the integral with the (proper) integral $I_N(s) = \int_0^N \frac{dx}{x+s}$, where $N$ is large. We then study the asymptotic behavior of $I_N(s)$ as $N \to \infty$. Obviously,

$$I_N(s) = \int_0^N \frac{dx}{x+s} = \ln(x+s)\Big|_{x=0}^{x=N} = \ln(N+s) - \ln s = \ln(1 + N/s),$$

whence

$$I'_N(s) = \frac{1}{N+s} - \frac{1}{s},$$

which agrees with (5.7.5a) (since $I_N(s)$ is a proper integral):

$$I'_N(s) = \int_0^N \frac{\partial}{\partial s}\Big(\frac{1}{x+s}\Big)\,dx = -\int_0^N \frac{dx}{(x+s)^2} = \frac{1}{x+s}\Big|_{x=0}^{x=N} = \frac{1}{N+s} - \frac{1}{s}.$$

If we now send $N$ to infinity, we will see that $I_N(s)$ tends to infinity and $I'_N(s)$ tends to $-1/s$, in complete agreement with the result we arrived at above.

The above-described procedure of cutting off the singularities often helps understand the meaning of results related to nonnormally convergent integrals or even divergent integrals, results that at first glance seem paradoxical. For instance, notwithstanding the fact that $I(s) = \int_0^\infty \frac{dx}{x+s} = \infty$, the difference between the integrals $I(s_2)$ and $I(s_1)$ proves to be finite:

$$I(s_2) - I(s_1) = \int_0^\infty \frac{dx}{x+s_2} - \int_0^\infty \frac{dx}{x+s_1} = \int_0^\infty \Big(\frac{1}{x+s_2} - \frac{1}{x+s_1}\Big)\,dx = \int_0^\infty \frac{s_1 - s_2}{x^2 + (s_1+s_2)x + s_1 s_2}\,dx = \ln\frac{s_1}{s_2} \tag{5.7.6}$$

(the last integral is normally convergent, since if, say, $0 < s_2 < s_1$, we have $\frac{s_1 - s_2}{x^2 + (s_1+s_2)x + s_1 s_2} < \frac{s_1 - s_2}{x^2}$ and $\int_1^\infty \frac{s_1 - s_2}{x^2}\,dx = -\frac{s_1 - s_2}{x}\Big|_{x=1}^{x=\infty}$). The result (5.7.6) can also be easily understood if we start by considering the difference $I_N(s_2) - I_N(s_1) = \int_0^N \frac{dx}{x+s_2} - \int_0^N \frac{dx}{x+s_1}$.

Here is one more striking example dealing with a nonnormally convergent integral dependent on a parameter. Take the quantity

$$I(\lambda) = \int_0^\infty \frac{\sin \lambda x}{x}\,dx. \tag{5.7.7}$$

0 

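It can be proved that $I(1) = \int_0^\infty \frac{\sin x}{x}\,dx = \pi/2$; here we will not discuss the validity of this neat relationship. On the other hand, for $\lambda > 0$ we have

$$I(\lambda) = \int_0^\infty \frac{\sin \lambda x}{x}\,dx = \int_0^\infty \frac{\sin \lambda x}{\lambda x}\,d(\lambda x) = \int_0^\infty \frac{\sin t}{t}\,dt = I(1) = \frac{\pi}{2},$$

while for $\lambda < 0$ we have

$$I(\lambda) = \int_0^\infty \frac{\sin \lambda x}{x}\,dx = \int_0^\infty \frac{\sin(-|\lambda| x)}{x}\,dx = -\int_0^\infty \frac{\sin(|\lambda| x)}{x}\,dx = -\frac{\pi}{2}$$

(at $\lambda = 0$, obviously, $I(0) = 0$). Thus, although a small variation in $\lambda$ changes the integrand but little, the integral (5.7.7) for some reason experiences a jump at $\lambda = 0$:

$$I(\lambda) = \begin{cases} \pi/2 & \text{if } \lambda > 0, \\ 0 & \text{if } \lambda = 0, \\ -\pi/2 & \text{if } \lambda < 0. \end{cases} \tag{5.7.8}$$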
This is due to the fact that, when we integrate over an infinitely large domain, even a small variation of the integrand $f(x)$ may result in an appreciable effect, since $\int f_1(x)\,dx - \int f(x)\,dx = \int [f_1(x) - f(x)]\,dx$ and here we add small differences $f_1(x) - f(x)$ “extended” over an infinitely large domain.

In this case, too, cutting off singularities in the integral (5.7.7) may clarify the situation. Of course, the integral $I_N(\lambda) = \int_0^N \frac{\sin \lambda x}{x}\,dx$ depends on $\lambda$ in a continuous fashion, since this function, $I_N(\lambda)$, has a derivative:

$$I'_N(\lambda) = \int_0^N \frac{\partial}{\partial \lambda}\Big(\frac{\sin \lambda x}{x}\Big)\,dx = \int_0^N \cos \lambda x\,dx$$

(while if we apply formula (5.7.5a) to $I(\lambda)$, we get $I'(\lambda) = \int_0^\infty \cos \lambda x\,dx$, whose right-hand side is not defined).$^{5.7}$ But it is clear (in view of the very definition of the improper integral (5.7.7)) that, as $N \to \infty$, the value of $I_N(\lambda)$ tends to the values defined in (5.7.8); for any $N$ the value of $I_N(\lambda)$ is positive for $\lambda > 0$, negative for $\lambda < 0$, and zero at $\lambda = 0$. (Why?) If $N$ is very great, the function $I_N(\lambda)$ very rapidly passes from “large” positive values (close to $\pi/2$) to values that are very close to $-\pi/2$, performing in the neighborhood of point $\lambda = 0$ a “rapid” jump (but, of course, remaining continuous at this point). But if $N$ is relatively small, the passage of the function $I_N(\lambda)$ from positive values to negative ones is much smoother (Figure 5.7.1). In the limit, when $N \to \infty$, the transition from “essentially positive” to “essentially negative” values of $I_N(\lambda)$ occurs, so to say, over an infinitely small interval of values of $\lambda$, that is, here the function $I(\lambda) = \lim_{N\to\infty} I_N(\lambda)$ experiences a jump (cf. Chapter 16).

$^{5.7}$ Compare this with the procedure, described in Section 16.3, of finding the derivative of $\operatorname{sgn} x$, a function that is closely related to $I(\lambda)$ (we can easily see that $I(\lambda) = (\pi/2) \operatorname{sgn} \lambda$).
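The behavior of the cut-off integral $I_N(\lambda)$ can be tabulated numerically. The following is a minimal sketch under stated assumptions: the function name `cut_off_integral`, the midpoint rule, and the sample values of $\lambda$ and $N$ are all chosen here for illustration. Midpoints never coincide with $x = 0$, so the integrand is always defined.

```python
import math


def cut_off_integral(lam, N, steps=100_000):
    """Midpoint-rule estimate of I_N(lam), the integral of sin(lam*x)/x over [0, N]."""
    width = N / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width        # midpoint of the i-th subinterval, never zero
        total += math.sin(lam * x) / x
    return width * total


sample_lambdas = (-1.0, -0.05, 0.0, 0.05, 1.0)

for N in (10.0, 200.0):
    row = ", ".join(f"{cut_off_integral(lam, N):+.4f}" for lam in sample_lambdas)
    print(f"N = {N:6.1f}: I_N(lambda) for lambda = {sample_lambdas}: {row}")

print("pi/2 =", math.pi / 2)
```

For the larger cutoff the values at small nonzero $\lambda$ already lie close to $\pm\pi/2$, while for the smaller cutoff the passage through zero is visibly gentler, in line with the description of Figure 5.7.1.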



Chapter 6 Series. Simple Differential Equations 


6.1 A Series Representation 
of a Function 

Suppose that a function $y(x)$ is defined exactly by a formula so complex that it is awkward for calculating the values of $y = y(x)$. We pose the problem of constructing a simple and convenient approximate expression for the function $y(x)$ over a small range of the independent variable $x$, say, for values of $x$ close to a fixed number $a$.

The definition of a derivative given in Chapter 2 may be written as follows:

$$y'(a) = \lim_{x\to a} \frac{y(x) - y(a)}{x - a},$$

from which it follows that

$$y(x) \approx y(a) + (x - a)\,y'(a), \tag{6.1.1}$$

where the approximate equality is “exact in the limit,” that is, the smaller the difference $|x - a|$ the greater the accuracy. This formula shows that for values of $x$ close to $a$ the variation of the independent variable by a (small) quantity $x - a$ corresponds to a variation in the value of the function equal to $y'(a)(x - a)$, that is, the formula fits the meaning of the derivative as the rate of change of the function at point $x = a$, or $(dy/dx)_{x=a} = y'(a)$.

For large values of $|x - a|$ the accuracy of (6.1.1) becomes poor: the greater the difference $|x - a|$ the poorer the accuracy. Indeed, if we put $\Delta y = y(x) - y(a) = y'(a)(x - a)$, we are assuming that the rate of change of the function everywhere between $x$ and $a$ is the same and equal to the rate $y'(a)$ of change of the function at point $a$, while actually the rate $y'$ also changes in the interval between $x$ and $a$. The exact formula that takes into account this change in $y'(x)$ has the form$^{6.1}$

$$y(x) = y(a) + \int_a^x y'(t)\,dt. \tag{6.1.2}$$

Formula (6.1.1) is known as the first, or linear,$^{6.2}$ approximation for function $y(x)$.

Applying (6.1.1) to the derivative $y'(x)$, we get

$$y'(x) \approx y'(a) + (x - a)\,y''(a). \tag{6.1.3}$$

Before proceeding, let us recall that $y''(x) = dy'/dx$ is the second derivative of function $y(x)$ with respect to $x$, denoted also $d^2y/dx^2$. It is obtained from $y'$ just as $y'$ is obtained from $y$. The third derivative $y''' = dy''/dx$ is derived in a similar fashion. The fourth derivative is denoted $y^{\mathrm{IV}}$, the fifth $y^{\mathrm{V}}$ or $y^{(5)}$, and so on. The derivative of order $n$, or the $n$th derivative, which is obtained by taking the derivative of the function $y(x)$ $n$ times in succession, is denoted by $y^{(n)}$ or $d^n y/dx^n$ (the second, third, and fourth derivatives are denoted by $y''$, $y'''$, and $y^{\mathrm{IV}}$, but not by $y^{(2)}$, $y^{(3)}$, and $y^{(4)}$). In the notation $y^{(n)}$, the $n$ is enclosed in parentheses to distinguish it from an exponent.

Clearly, if $x$ is measured in units of $e_1$ and $y$ in units of $e_2$, then the dimensions of the $n$th derivative $d^n y/dx^n$ are $e_2/e_1^n$. For instance, if $y$ is distance (measured in centimeters) and $x$ is time (measured in seconds), the second derivative (the acceleration) $y''$ has the dimensions of cm/s$^2$; the dimensions of the increment $\Delta y''$ of the acceleration are also cm/s$^2$, which means that the rate of change of the acceleration $y''' = \lim (\Delta y''/\Delta x)$ has the dimensions of cm/s$^3$, and so on.

$^{6.1}$ Formula (6.1.2) follows directly from (3.4.5a), since $\int_a^x y'(t)\,dt = y(t)\big|_a^x = y(x) - y(a)$. Formula (6.1.5) is verified in the same manner.

$^{6.2}$ In the expression (6.1.1) for $y$, the quantity $x$ appears only in the first power, in other words, $y$ is a polynomial of the first degree in $x$. This relationship is termed linear because its graph is a straight line (see Section 1.3).

Now let us return to the problem of the approximate expression of a function. Formula (6.1.3) for the derivative is nothing but formula (6.1.1) applied to $y'$ instead of $y$. Now substitute the approximate expression (6.1.3) for the derivative $y'$ into the exact formula (6.1.2). The result is

$$y(x) \approx y(a) + \int_a^x [y'(a) + (t - a)\,y''(a)]\,dt = y(a) + (x - a)\,y'(a) + \frac{(x - a)^2}{2}\,y''(a). \tag{6.1.4}$$

This formula is approximate but is more exact than (6.1.1). In deriving (6.1.4) we took into account that the derivative $y'(x)$ is not constant, but the variation in $y'(x)$ was considered only approximately: formula (6.1.3), which we made use of when deriving (6.1.4), assumes that $y''(x)$ is constant (and equal to $y''(a)$), which is what gives us the linear dependence of $y'$ on $x$. For $y(x)$ the relationship is quadratic. Let us again check the dimensions in (6.1.4) by employing the standard distance-time example, namely, if $y$ is measured in centimeters and $x$ in seconds, then the dimensions of $y'$ and $y''$ are, respectively, cm/s (velocity) and cm/s$^2$ (acceleration), while the quantities $y(x)$ and $y(a)$ have the dimensions of cm and the factors $(x - a)$ and $(1/2)(x - a)^2$ have, respectively, the dimensions of s and s$^2$, whereby the dimensions of the left- and right-hand sides of (6.1.4) coincide (cm).

Let us make formula (6.1.4) still more precise by taking into consideration that $y''$ is not constant. We take advantage of the formula

$$y'(x) = y'(a) + \int_a^x y''(t)\,dt, \tag{6.1.5}$$

which is obtained from (6.1.2) by substituting $y'$ for $y$. Let us represent $y''(x)$ using a formula of the type (6.1.1) applied not to the initial function $y(x)$ but to the function $y''(x)$:

$$y''(x) \approx y''(a) + (x - a)\,y'''(a) \tag{6.1.6}$$

and then substitute this expression into (6.1.5). From (6.1.5) and (6.1.6) we obtain

$$y'(x) \approx y'(a) + \int_a^x [y''(a) + (t - a)\,y'''(a)]\,dt,$$

or

$$y'(x) \approx y'(a) + (x - a)\,y''(a) + \frac{(x - a)^2}{2}\,y'''(a). \tag{6.1.7}$$

Note that (6.1.7) is a formula of the type (6.1.4) written for $y'(x)$.

Now let us substitute the expression (6.1.7) for $y'(x)$ into (6.1.2):

$$y(x) \approx y(a) + \int_a^x \Big[y'(a) + y''(a)(t - a) + y'''(a)\,\frac{(t - a)^2}{2}\Big]\,dt = y(a) + y'(a)(x - a) + y''(a)\,\frac{(x - a)^2}{2} + y'''(a)\,\frac{(x - a)^3}{6}. \tag{6.1.8}$$

The general law becomes all the more obvious if we compare the expressions obtained above. In the crudest approximation we assume that for a small difference $|x - a|$,

$$y(x) \approx y(a) \tag{6.1.9}$$

(this fact is obvious and does not require knowing higher mathematics). We call this equation the zeroth approximation. Then expression (6.1.1) is called the first approximation, expression (6.1.4) the second approximation, and expression (6.1.8) the third approximation. Listed, they are

$y(x) \approx y(a)$ (zeroth approximation)

$y(x) \approx y(a) + (x - a)\,y'(a)$ (first approximation)

$y(x) \approx y(a) + (x - a)\,y'(a) + \frac{(x - a)^2}{2}\,y''(a)$ (second approximation)

$y(x) \approx y(a) + (x - a)\,y'(a) + \frac{(x - a)^2}{2}\,y''(a) + \frac{(x - a)^3}{6}\,y'''(a)$ (third approximation).

It is now easy to imagine the aspect of the approximate formulas for $y(x)$ if the approximation process is continued: if we take into account that $y'''$ is not constant, then $y^{\mathrm{IV}}(a)$ will be involved and the expression for $y(x)$ (the fourth approximation) will contain a term with $(x - a)^4$. Each subsequent step in approximating $y(x)$ yields an additional term with a higher power of $x - a$. One can expect that the more powers of $x - a$ that enter into the formula, the more exact the formula is. Of course, what we have said is valid only for small $|x - a|$, while for large $|x - a|$ all the above formulas may prove invalid (see Section 6.3).
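The improvement from one approximation to the next can be seen on a concrete function. The sketch below is an illustration only: the choice of $y = \sin x$, the point $a = 1$, the value $x = 1.2$, and the helper name `approximation` are assumptions made here. The coefficients 1, 1, 1/2, 1/6 are those of the zeroth to third approximations listed above.

```python
import math

# Coefficients of the zeroth, first, second, and third approximations.
COEFFICIENTS = (1.0, 1.0, 1.0 / 2.0, 1.0 / 6.0)


def approximation(derivatives_at_a, a, x, order):
    """Sum the terms c_k * y^(k)(a) * (x - a)**k for k = 0 .. order (order <= 3)."""
    dx = x - a
    return sum(COEFFICIENTS[k] * derivatives_at_a[k] * dx ** k
               for k in range(order + 1))


a = 1.0
x = 1.2
# y = sin x and its first three derivatives at the point a.
derivatives = (math.sin(a), math.cos(a), -math.sin(a), -math.cos(a))

exact = math.sin(x)
errors = []
for order in range(4):
    value = approximation(derivatives, a, x, order)
    errors.append(abs(value - exact))
    print(f"order {order}: value = {value:.6f}, error = {errors[-1]:.2e}")
```

With $x - a = 0.2$ each added term reduces the error; repeating the run with a large $|x - a|$ shows how the formulas can stop improving, as the text warns.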

The general formula for the $n$th approximation has the form

$$y(x) \approx y(a) + c_1 y'(a)(x - a) + c_2 y''(a)(x - a)^2 + c_3 y'''(a)(x - a)^3 + \ldots + c_n y^{(n)}(a)(x - a)^n, \tag{6.1.10}$$

which follows from dimensional considerations. Indeed, if $y$ is measured in units of $e_2$ and $x$ in units of $e_1$, then $y^{(k)}(a) = (d^k y/dx^k)_{x=a}$ has the dimensions of $e_2/e_1^k$, so that in the approximate expression for $y(x)$ the term with $y^{(k)}(a)$ must contain a factor of dimensions of $e_1^k$, that is, the factor $(x - a)^k$. Of course, these ideas do not enable us to calculate the numerical values of the coefficients $c_1, c_2, c_3, \ldots$ in (6.1.10). (We know that $c_1 = 1$, $c_2 = 1/2$, $c_3 = 1/6$; but what are $c_4$ and all the other coefficients?) These coefficients can be calculated by the following method.


We denote the right-hand side of (6.1.10) (a polynomial of degree $n$ in variable $x$) by $P_n(x)$:

$$P_n(x) = y(a) + c_1 y'(a)(x - a) + c_2 y''(a)(x - a)^2 + c_3 y'''(a)(x - a)^3 + \ldots + c_n y^{(n)}(a)(x - a)^n. \tag{6.1.11}$$

From this it readily follows that $P_n(a) = y(a)$. Compute the first derivative of (6.1.11):

$$P'_n(x) = c_1 y'(a) + 2 c_2 y''(a)(x - a) + 3 c_3 y'''(a)(x - a)^2 + \ldots + n c_n y^{(n)}(a)(x - a)^{n-1}. \tag{6.1.11a}$$

This means that $P'_n(a) = c_1 y'(a)$. Now let us require that not only the value of the polynomial (6.1.11) at point $x = a$ coincide with the value $y(a)$ of the function $y(x)$ at the same point, but also the rate $P'_n(a)$ of the variation at point $x = a$ be the same as the rate $y'(a)$ of the variation of function $y(x)$. For this to happen, we must put $c_1 = 1$. Finding the first and higher-order derivatives of (6.1.11a) and replacing $x$ with $a$ in the resulting expressions yields


$P''_n(a) = 2 c_2\, y''(a)$ and $P''_n(a) = y''(a)$ at $c_2 = 1/2$,

$P'''_n(a) = 6 c_3\, y'''(a)$ and $P'''_n(a) = y'''(a)$ at $c_3 = 1/6$,

$P^{\mathrm{IV}}_n(a) = 24 c_4\, y^{\mathrm{IV}}(a)$ and $P^{\mathrm{IV}}_n(a) = y^{\mathrm{IV}}(a)$ at $c_4 = 1/24$,

. . .

$P^{(n)}_n(a) = 2\cdot 3\cdot 4 \cdots n\; c_n\, y^{(n)}(a)$ and $P^{(n)}_n(a) = y^{(n)}(a)$ at $c_n = 1/(2\cdot 3\cdot 4 \cdots n)$.

Thus, if we require that the values of $n$ successive derivatives of the polynomial $P_n(x)$ at $x = a$ coincide with the derivatives of the same order of the function $y(x)$ at the same point (this requirement means a closeness between $P_n(x)$ and $y(x)$), this means that we must put $c_1 = 1$, $c_2 = 1/2$, $c_3 = 1/(2\cdot 3)$, $c_4 = 1/(2\cdot 3\cdot 4)$, ..., $c_n = 1/(2\cdot 3 \cdots n)$. Thus,

$$P_n(x) = y(a) + y'(a)(x - a) + \frac{y''(a)}{2}(x - a)^2 + \ldots + \frac{y^{(n)}(a)}{2\cdot 3 \cdots n}(x - a)^n. \tag{6.1.12}$$

We have arrived at the approximation formula

$$y(x) \approx y(a) + y'(a)(x - a) + \frac{y''(a)}{2}(x - a)^2 + \frac{y'''(a)}{2\cdot 3}(x - a)^3 + \ldots + \frac{y^{(n)}(a)}{2\cdot 3 \cdots n}(x - a)^n. \tag{6.1.13}$$

We have a convenient designation for the product of a succession of natural numbers $1\cdot 2\cdot 3 \cdots n$. It is $n!$ and is read factorial $n$. For example, $3! = 1\cdot 2\cdot 3 = 6$, $4! = 1\cdot 2\cdot 3\cdot 4 = 24$, $5! = 1\cdot 2\cdot 3\cdot 4\cdot 5 = 120$, etc., while $1! = 1$ and $2! = 2$. Using the factorial notation, we can rewrite (6.1.13) as

$$y(x) \approx y(a) + \frac{y'(a)}{1!}(x - a) + \frac{y''(a)}{2!}(x - a)^2 + \ldots + \frac{y^{(n)}(a)}{n!}(x - a)^n. \tag{6.1.13a}$$

Here is a fuller proof of formulas (6.1.13) and (6.1.13a), which also leads to results that are stronger than the ones obtained above. Let us return to the exact equation (6.1.2) and substitute $d(t - x)$ for $dt$ under the integral sign:

$$y(x) = y(a) + \int_a^x y'(t)\,dt = y(a) + \int_a^x y'(t)\,d(t - x).$$

This substitution is justified since the upper limit of integration $x$ is regarded as a constant and therefore $d(t - x) = dt$.

Let us now integrate by parts (see formula (5.4.3a)) to get

$$y(x) = y(a) + \int_a^x y'(t)\,d(t - x) = y(a) + y'(t)(t - x)\Big|_a^x - \int_a^x (t - x)\,dy'(t) = y(a) + y'(a)(x - a) + \int_a^x (x - t)\,y''(t)\,dt. \tag{6.1.14}$$

If we again substitute $d(t - x)$ for $dt$ and integrate by parts, we get

$$\int_a^x (x - t)\,y''(t)\,d(t - x) = -\int_a^x y''(t)\,d\,\frac{(x - t)^2}{2} = -y''(t)\,\frac{(x - t)^2}{2}\Big|_a^x + \int_a^x \frac{(x - t)^2}{2}\,dy''(t) = y''(a)\,\frac{(x - a)^2}{2} + \frac{1}{2}\int_a^x (x - t)^2\,y'''(t)\,dt,$$

whence

$$y(x) = y(a) + y'(a)(x - a) + \frac{y''(a)}{2}(x - a)^2 + \frac{1}{2}\int_a^x (x - t)^2\,y'''(t)\,dt. \tag{6.1.14a}$$

Performing the integration by parts $n$ times, we get an exact expression for $y(x)$ consisting of $n + 2$ terms:

$$y(x) = y(a) + y'(a)(x - a) + \frac{y''(a)}{2!}(x - a)^2 + \ldots + \frac{y^{(n)}(a)}{n!}(x - a)^n + \frac{1}{n!}\int_a^x (x - t)^n\,y^{(n+1)}(t)\,dt. \tag{6.1.15}$$

This formula, in contrast to (6.1.13a), is exact, since it was obtained as a result of transformation of the exact formula (6.1.2). The last term, $r_n = \frac{1}{n!}\int_a^x (x - t)^n\,y^{(n+1)}(t)\,dt$, on the right-hand side of (6.1.15) is known as the remainder; it is the difference between the $y(x)$ in (6.1.13) and the approximate expression for $y(x)$ expressed by the right-hand side of the same formula.
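The exactness of (6.1.15) can be checked numerically. The following is a rough sketch under stated assumptions: the function $y = e^x$ (whose derivatives all equal $e^x$), the points $a = 0$ and $x = 1$, the midpoint rule, and the helper names are illustrative choices made here. It adds the remainder integral to the polynomial part and compares the sum with $e^x$.

```python
import math


def polynomial_part(a, x, n):
    """First n + 1 terms of (6.1.15) for y = exp: sum of exp(a) * (x - a)**k / k!."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))


def remainder_integral(a, x, n, steps=20_000):
    """Midpoint-rule estimate of r_n = (1/n!) * integral of (x - t)**n * exp(t) over [a, x]."""
    width = (x - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * width
        total += (x - t) ** n * math.exp(t)   # y^(n+1)(t) = exp(t) for y = exp
    return width * total / math.factorial(n)


a, x = 0.0, 1.0
for n in range(6):
    poly = polynomial_part(a, x, n)
    rem = remainder_integral(a, x, n)
    print(f"n = {n}: polynomial = {poly:.8f}, remainder = {rem:.8f}, "
          f"sum = {poly + rem:.8f}, exp(x) = {math.exp(x):.8f}")
```

The remainder shrinks as $n$ grows, while the sum of the two parts stays equal to $e^x$ up to the error of the numerical integration.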
Formula (6.1.15) can be rewritten in a more convenient form that does not contain an integral:$^{6.3}$

$$y(x) = y(a) + y'(a)(x - a) + \frac{y''(a)}{2!}(x - a)^2 + \ldots + \frac{y^{(n)}(a)}{n!}(x - a)^n + \frac{(x - a)^{n+1}}{(n+1)!}\,y^{(n+1)}(c), \tag{6.1.16}$$

where $c$ is a number lying between $x$ and $a$.


In the general case of an arbitrary function $y(x)$, no finite number of powers of $x - a$ with constant coefficients can yield an absolutely exact formula for the initial function.$^{6.4}$ In other words, the right-hand side of (6.1.13) or (6.1.13a) only approximately coincides with $y(x)$, but is not equal to $y(x)$ exactly. An exact formula can be obtained only by adding an infinite number of powers of $x - a$:

$$y(x) = c_0 + c_1 (x - a) + c_2 (x - a)^2 + c_3 (x - a)^3 + \ldots + c_n (x - a)^n + \ldots. \tag{6.1.17}$$

The expression on the right-hand side of (6.1.17) is called an infinite series. Ordinarily, we drop the word “infinite” and simply say “series”. Of course, we can also say that (6.1.17) is true “in the limit”, but this means something different than it did above. The crux of the problem is that we cannot add directly an infinite number of terms; thus, in writing out the sum on the right, we are forced to restrict ourselves to a finite number $n$ of the first terms in the series. The statement that for sufficiently small $|x - a|$ formula (6.1.17) is “exact” means that the sum on the right can be made as close to $y(x)$ as desired (what is required is a sufficiently large number $n$ of terms). We can also say that $y(x)$ is the limit of the sum of the first $n$ terms of the series as $n \to \infty$.$^{6.5}$

The coefficients $c_0, c_1, \ldots, c_n, \ldots$ in (6.1.17) are different for different functions; they also depend on point $a$. As we have seen earlier, $c_0 = y(a)$, $c_1 = y'(a)$, $c_2 = y''(a)/2!$, $c_3 = y'''(a)/3!$, etc., that is,

$$y(x) = y(a) + y'(a)(x - a) + \frac{y''(a)}{2!}(x - a)^2 + \frac{y'''(a)}{3!}(x - a)^3 + \ldots + \frac{y^{(n)}(a)}{n!}(x - a)^n + \ldots. \tag{6.1.18}$$

A particular case of (6.1.18) where $a = 0$ is

$$y(x) = y(0) + y'(0)\,x + \frac{y''(0)}{2!}\,x^2 + \frac{y'''(0)}{3!}\,x^3 + \ldots + \frac{y^{(n)}(0)}{n!}\,x^n + \ldots. \tag{6.1.19}$$

Using the summation sign, we can write formulas (6.1.18) and (6.1.19) in compact form as follows:

$$y(x) = y(a) + \sum_{n=1}^{\infty} \frac{y^{(n)}(a)}{n!}(x - a)^n, \tag{6.1.18a}$$

$$y(x) = y(0) + \sum_{n=1}^{\infty} \frac{y^{(n)}(0)}{n!}\,x^n. \tag{6.1.19a}$$

These two formulas (as well as (6.1.18) and (6.1.19)) yield the expansion of a function $y(x)$ in a series of integral powers of $x - a$ and $x$, respectively. The series on the right-hand side of (6.1.18) or (6.1.18a) is called a Taylor's series, while the particular case of a Taylor's series, present on the right-hand side of (6.1.19) or (6.1.19a), is known as Maclaurin's series.

$^{6.3}$ As for the validity of the transition from (6.1.15) to (6.1.16), see the text in small print at the end of Section 7.8, in particular, formula (7.8.13), in which the integration variable $x$ must be replaced with $x - t = z$ and the function $f(x)$ with $y^{(n+1)}(t)$ (which depends on the new variable $x - t = z$; we remind the reader that $d(x - t) = -dt$, since $x$ is regarded as constant). We then get

$$\int_a^x (x - t)^n\,y^{(n+1)}(t)\,dt = \frac{(x - a)^{n+1}}{n + 1}\,y^{(n+1)}(c),$$

where $c$ is a value of the independent variable lying between $x$ and $a$.

$^{6.4}$ Except for the case when $y(x)$ is a polynomial (see the beginning of Section 6.2).

$^{6.5}$ It may so happen that for different values of $x$ the accuracy with which the sum of the first $n$ terms approximates $y(x)$ depends on $n$. (We will not dwell on the general questions of the applicability of formulas of type (6.1.17) and of the functions for which the expansions (6.1.18) and (6.1.19) prove to be invalid.)

Examples of applying Taylor's and Maclaurin's series to concrete functions are discussed in Section 6.2, but a simple example of representing functions in the form of series will be considered here.

Suppose that $y = e^x$. Then $y' = e^x$, $y'' = e^x$, ..., $y^{(n)} = e^x$, .... We take advantage of the formula (6.1.19) for a Maclaurin's series. In the case at hand,

$$y(0) = y'(0) = y''(0) = \ldots = 1.$$

Substituting into (6.1.19), we arrive at the expansion of the function $y = e^x$ in a Maclaurin's series:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots + \frac{x^n}{n!} + \ldots. \tag{6.1.20}$$

Let us now examine the formula obtained from the Taylor expansion (6.1.18) if we confine ourselves to, say, three terms in the series:

$$y(x) \approx y(a) + (x - a)\,y'(a) + \frac{(x - a)^2}{2}\,y''(a).$$

Removing brackets on the right-hand side and arranging the result in increasing powers of $x$, we get

$$y(x) \approx \Big[y(a) - a\,y'(a) + \frac{a^2}{2}\,y''(a)\Big] + [y'(a) - a\,y''(a)]\,x + \frac{1}{2}\,y''(a)\,x^2. \tag{6.1.21}$$

On the right is a polynomial of degree two. Note that this polynomial does not coincide with what we would have if we took three terms in Maclaurin's series:

$$y(x) \approx y(0) + y'(0)\,x + \frac{y''(0)}{2}\,x^2. \tag{6.1.21a}$$

This will become clear if we recall that formula (6.1.21) yields a good result if $x$ is close to $a$, while formula (6.1.21a) is good when $x$ is close to zero.

In Chapter 2 we gave a definition of a derivative as the limit of the ratio of the increment of the function to the increment of the independent variable. Now that we have expressed the function as a series, we state generally the law according to which the ratio $\Delta y/\Delta x$ approaches $dy/dx$ as $\Delta x$ tends to zero.

Let us take Taylor's series (6.1.18) and denote $x - a$ by $\Delta x$. We then shift $y(a)$ to the left-hand side and divide both sides by $\Delta x$, denoting $y(x) - y(a)$ by $\Delta y$. The result is

$$\frac{\Delta y}{\Delta x} = y'(a) + \frac{1}{2}\,y''(a)\,\Delta x + \frac{1}{6}\,y'''(a)\,(\Delta x)^2 + \ldots. \tag{6.1.22}$$

For small $\Delta x$, the second term on the right-hand side, containing $\Delta x$, is greater than the third. Discarding the latter, we conclude that the difference of the ratio $\Delta y/\Delta x$ from the value of the derivative $y'(a)$ at the endpoint $a$ of the interval extending from $x = a$ to $x = a + \Delta x$ is proportional to the second derivative $y''(a)$ and the size of the interval, $\Delta x$:

$$\frac{\Delta y}{\Delta x} \approx y'(a) + \frac{1}{2}\,y''(a)\,\Delta x.$$

Note that here we compare the ratio of the increment of the function on the range of variation of $x$ considered in the problem to the increment of $x$ with the value of the derivative $y'(a)$ at the endpoint $a$ of this interval.

The derivative may be evaluated differently by assuming that $[f(x + \Delta x/2) - f(x - \Delta x/2)]/\Delta x$ is its approximate value rather than $[f(x + \Delta x) - f(x)]/\Delta x$. In other words, to calculate the derivative at point $a$, we take the increment of the function as $x$ varies from $a - \Delta x/2$ to $a + \Delta x/2$ and divide it by $\Delta x$. Let us now compare this new ratio with the derivative $y'(a)$ evaluated at the midpoint of the interval. We get

$$\Delta y = f\Big(a + \frac{\Delta x}{2}\Big) - f\Big(a - \frac{\Delta x}{2}\Big),$$

$$f\Big(a + \frac{\Delta x}{2}\Big) \approx f(a) + \frac{\Delta x}{2}\,f'(a) + \frac{1}{2}\Big(\frac{\Delta x}{2}\Big)^2 f''(a) + \frac{1}{6}\Big(\frac{\Delta x}{2}\Big)^3 f'''(a),$$

$$f\Big(a - \frac{\Delta x}{2}\Big) \approx f(a) - \frac{\Delta x}{2}\,f'(a) + \frac{1}{2}\Big(\frac{\Delta x}{2}\Big)^2 f''(a) - \frac{1}{6}\Big(\frac{\Delta x}{2}\Big)^3 f'''(a),$$

$$\frac{\Delta y}{\Delta x} \approx f'(a) + \frac{(\Delta x)^2}{24}\,f'''(a). \tag{6.1.23}$$
We see that the new method of estimating $f'(a)$ is much more exact: the difference between the ratio of the increments and the derivative is proportional to $(\Delta x)^2$ and not to $\Delta x$ and, what is more, contains the small coefficient 1/24. Thus, approximately,

$$f'(a) \approx \frac{f(a + \Delta x) - f(a)}{\Delta x}, \tag{6.1.24}$$

while the more exact formula follows from (6.1.22),

$$f'(a) \approx \frac{f(a + \Delta x) - f(a)}{\Delta x} - \frac{1}{2}\,f''(a)\,\Delta x. \tag{6.1.24a}$$

On the other hand,

$$f'(a) \approx \frac{f(a + \Delta x/2) - f(a - \Delta x/2)}{\Delta x}, \tag{6.1.25}$$

while the more exact formula follows from (6.1.23),

$$f'(a) \approx \frac{f(a + \Delta x/2) - f(a - \Delta x/2)}{\Delta x} - \frac{1}{24}\,f'''(a)\,(\Delta x)^2. \tag{6.1.25a}$$

All these equations, (6.1.24), (6.1.24a), (6.1.25), and (6.1.25a), are of course approximate, since we have ignored higher-order quantities in comparison to those that enter into these formulas (with respect to $\Delta x$).

In a similar manner we can obtain approximate formulas for calculating the second derivative. By definition,

$$f''(a) = \frac{df'}{dx} \approx \frac{\Delta f'}{\Delta x}.$$

Replacing $\Delta f'$ with $f'(a + \Delta x) - f'(a)$ and estimating $f'(a + \Delta x)$ and $f'(a)$ via the ratios $[f(a + 2\Delta x) - f(a + \Delta x)]/\Delta x$ and $[f(a + \Delta x) - f(a)]/\Delta x$, we arrive at an approximate expression for the second derivative of the function that incorporates $[f(a + 2\Delta x) - 2f(a + \Delta x) + f(a)]/(\Delta x)^2$. To estimate this ratio, we again turn to Taylor's series:

$$f(a + 2\Delta x) \approx f(a) + f'(a)(2\Delta x) + \frac{1}{2}\,f''(a)(2\Delta x)^2 + \frac{1}{6}\,f'''(a)(2\Delta x)^3,$$

$$f(a + \Delta x) \approx f(a) + f'(a)\,\Delta x + \frac{1}{2}\,f''(a)(\Delta x)^2 + \frac{1}{6}\,f'''(a)(\Delta x)^3.$$

Using the first three terms on the right-hand sides of these formulas, we obtain

$$f''(a) \approx \frac{f(a + 2\Delta x) - 2f(a + \Delta x) + f(a)}{(\Delta x)^2},$$

while the more exact formula (the one that allows for all four terms) is

$$f''(a) \approx \frac{f(a + 2\Delta x) - 2f(a + \Delta x) + f(a)}{(\Delta x)^2} - f'''(a)\,\Delta x. \tag{6.1.26}$$

To obtain $f''(a)$, it is best to combine not the values $f(a)$, $f(a + \Delta x)$, and $f(a + 2\Delta x)$ of the function $f(x)$ calculated on the interval between $a$ and $a + 2\Delta x$ but the values $f(a - \Delta x)$, $f(a)$, and $f(a + \Delta x)$. (Here the value of the independent variable $x = a$ coincides with the midpoint of the range of variation of $x$.) Indeed, it can be easily shown that

$$f''(a) \approx \frac{f(a + \Delta x) - 2f(a) + f(a - \Delta x)}{(\Delta x)^2}$$

or, more precisely,

$$f''(a) \approx \frac{f(a + \Delta x) - 2f(a) + f(a - \Delta x)}{(\Delta x)^2} - \frac{1}{12}\,f^{\mathrm{IV}}(a)\,(\Delta x)^2 \tag{6.1.27}$$

(see Exercise 6.1.6).

Geometrically, the differences between the approximate formulas (6.1.24) and (6.1.25) for the derivative are as follows: in the first case we replaced the tangent $t$ to the graph of the function $y = f(x)$ (Figure 6.1.1) at point $M(a, f(a))$ (the slope of the tangent line at this point is equal to $f'(a)$) with the straight line $MM_1 = l$, where $M_1 = M_1(a + \Delta x, f(a + \Delta x))$, while in the second case $t$ is replaced with the straight line $N_1N_2 = l_1$, where $N_1 = N_1(a - \Delta x/2, f(a - \Delta x/2))$ and $N_2 = N_2(a + \Delta x/2, f(a + \Delta x/2))$. It is clear that the straight line $l_1$ is much closer in direction to the tangent line $t$ than the straight line $l$ (see Figure 6.1.1). In a similar manner, formula (6.1.26) is obtained on the assumption that the rate of change of derivative $y'$ (the rate with which the tangent to the curve $y = f(x)$ rotates as point $M$ moves along the curve) is estimated through the differences in variation of the derivative from point $M$ to point $M_1$ and from point $M_1$ to point $M_2(x + 2\Delta x, f(x + 2\Delta x))$, while for formula (6.1.27) the rate of change of derivative $y'$ is estimated through the differences in variation of the derivative from point $M'(x - \Delta x, f(x - \Delta x))$ to $M$ and from $M$ to $M_1$ (make a rough sketch).
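The difference in accuracy between the one-sided and the symmetric formulas is easy to observe numerically. The sketch below is illustrative only: the test function $\sin x$, the point $a = 1$, the step sizes, and the helper names are assumptions made here. It implements the leading terms of (6.1.24), (6.1.25), (6.1.26), and (6.1.27) and prints their errors for two values of $\Delta x$.

```python
import math


def forward_first(f, a, dx):
    """One-sided estimate of f'(a), formula (6.1.24)."""
    return (f(a + dx) - f(a)) / dx


def central_first(f, a, dx):
    """Symmetric estimate of f'(a), formula (6.1.25)."""
    return (f(a + dx / 2) - f(a - dx / 2)) / dx


def forward_second(f, a, dx):
    """One-sided estimate of f''(a), leading term of (6.1.26)."""
    return (f(a + 2 * dx) - 2 * f(a + dx) + f(a)) / dx ** 2


def central_second(f, a, dx):
    """Symmetric estimate of f''(a), leading term of (6.1.27)."""
    return (f(a + dx) - 2 * f(a) + f(a - dx)) / dx ** 2


a = 1.0
exact_first = math.cos(a)     # derivative of sin at a
exact_second = -math.sin(a)   # second derivative of sin at a

for dx in (0.1, 0.05):
    print(f"dx = {dx}")
    print("  first derivative errors:  forward",
          forward_first(math.sin, a, dx) - exact_first,
          " central", central_first(math.sin, a, dx) - exact_first)
    print("  second derivative errors: forward",
          forward_second(math.sin, a, dx) - exact_second,
          " central", central_second(math.sin, a, dx) - exact_second)
```

Halving $\Delta x$ roughly halves the errors of the one-sided formulas and roughly quarters those of the symmetric ones, matching the correction terms proportional to $\Delta x$ and $(\Delta x)^2$.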


Exercises 

6.1.1. Expand the third-degree polynomial $y = ax^3 + bx^2 + cx + d$ in a series in powers of $x - x_0$, where $x_0$ is an arbitrary fixed number. Compare the sum of the first two, three, and four terms of the expansion obtained with the polynomial.

6.1.2. Expand the function $y = xe^x$ in a Maclaurin's series. Verify that the expansion can be obtained from the expansion of $e^x$ by multiplying the latter by $x$.

6.1.3. Expand the function $e^x$ in a Taylor's series in powers of $x - 1$.

6.1.4. Determine the accuracy of the approximate formula $(1 + r)^m \approx e^{mr}$ for $r < 1$ and $mr < 1$. To do this, write the left- and right-hand members approximately as the sum $a + br + cr^2$, obtained by ignoring the third and higher powers of $r$.

6.1.5. Given the interval $\Delta x = 1$, 1/2, 1/4, and 1/8 and using formulas (6.1.24) and (6.1.25) (or (6.1.26) and (6.1.27)), respectively, find the approximate values of (a) the derivative of $e^x$ at $x = 0$ and (b) the second derivative of $e^x$ at $x = 0$. Compare the results with the exact values.

6.1.6. Prove formula (6.1.27).

6.2 Computing the Values 
of Functions by Means of Series 

Let us dwell briefly on the principles underlying the formulas of Section 6.1. When we began the study of the derivative, or the rate of change of a function, we assumed the function to be known, that is, we proceeded from the fact that we could compute the value of the function for any value of the independent variable. This is why, when we considered derivatives, we found them directly, empirically so to say, by computing the values of the function for close-lying values of the independent variable. Later on we learned how to find derivatives by formulas, and it turned out that setting up formulas for derivatives is a rather simple matter. And so finding the values of a function by means of a formula involving derivatives (all formulas of Section 6.1 were of this type) turns out to be even simpler than a direct computation of the function.

Since only in the case of a polynomial 
does a Taylor’s series terminate, or 
contain a finite number of terms, it 
follows that any function different from 
a polynomial will be represented by an 
infinite series (this statement will be 
proved somewhat later). The practical 
value of such infinite series for com- 
putational purposes is due to the pos- 
sibility of confining ourselves to a few 
(commonly two or three) terms of the 
series in order to obtain a sufficiently 
accurate result. This requires, of course, 
that the discarded terms of the series 
be small. 

Let us consider a few very simple ex- 
amples of series expansions of functions. 
We have already mentioned the (simple) 
example when the function 

y = b_0 + b_1 x + b_2 x^2 + ... + b_n x^n   (6.2.1)

is a polynomial of degree n. Its derivative y'(x) is a polynomial of degree n - 1, the derivative y''(x) is a polynomial of degree n - 2, and so on; finally, its derivative y^(n)(x) is a constant, while y^(n+1)(x) and all higher-order derivatives are identically zero. For this reason, for a polynomial the corresponding Taylor’s series terminates; in other words, it contains a finite number of terms. We have arrived at the same polynomial, but in the form of a Taylor expansion in powers of x - a:

y = c_0 + c_1 (x - a) + c_2 (x - a)^2 + ... + c_n (x - a)^n.   (6.2.1a)

For polynomials of nth degree, the sum 
of the first n + 1 terms in the Taylor’s 
series yields the exact expression valid 
for all values of x rather than only for 
values close to a , while formula (6.2.1) 
can be regarded as Maclaurin’s series 
(6.1.19) for our function y(x). (Why?) 

In the preceding section we considered the exponential function y = e^x and obtained formula (6.1.20),

e^x = 1 + x + x^2/2! + x^3/3! + ... + x^n/n! + ....   (6.2.2)

TABLE 6.1

x      e^x      1 + x   1 + x + x^2/2   1 + x + x^2/2 + x^3/6   1 + x + x^2/2 + x^3/6 + x^4/24
0.10   1.1052   1.10    1.1050          1.1052                  1.1052
0.25   1.2840   1.25    1.2812          1.2838                  1.2840
0.50   1.6487   1.50    1.6250          1.6458                  1.6484
0.75   2.1170   1.75    2.0312          2.1015                  2.1147
1.00   2.7183   2.00    2.5000          2.6667                  2.7083
1.25   3.4903   2.25    3.0312          3.3568                  3.4585
1.50   4.4817   2.50    3.6250          4.1876                  4.3986
2.00   7.3891   3.00    5.0000          6.3333                  7.0000


In particular, substituting x = 1 and x = -1, we get expressions for numbers e and e^(-1) in the form of series:

e = 1 + 1 + 1/2 + 1/6 + ... + 1/n! + ...,   (6.2.3)

1/e = 1 - 1 + 1/2 - 1/6 + 1/24 - 1/120 + ... + 1/(2n)! - 1/(2n + 1)! + ....   (6.2.3a)


Formula (6.2.2) enables us to compute e^x rapidly and to a high degree of accuracy, as can be seen from Table 6.1.

If at x = 0.1 we confine ourselves only to the first two terms, the error will not exceed 0.5% of the true value; at x = 0.5 the first three terms yield an error of 1.4%; finally, at x = 1.0 the first four terms yield an error of 1.9% of the function y = e^x.

Such a high accuracy is plainly due to the fact that the terms of the series (6.2.2) fall off rapidly. Each subsequent term in the series is less than the preceding one primarily because the denominator of the (n + 1)st term is n times the denominator of the preceding nth term. If x < 1, then in addition we have that x^n is the smaller the greater n is, and x^n rapidly decreases as n grows. But even if x > 1, the increase in the denominator in the distant terms of the series as n grows will inevitably overcome the increase in the numerator, since as we move from the nth term to the (n + 1)st term the numerator increases x-fold while the denominator increases n-fold, and all we have to do is to wait until the increase in the denominator overcomes the increase in the numerator (that is, until n becomes greater than x). As can be seen from Table 6.1, when x = 2, the sum of the first five terms in the series yields an error of 5%. But if we add the sixth term, x^5/120, then we get 7.2667, which yields an error of 1.7%, and adding the seventh term, x^6/720, gives 7.3556, which yields an error of only 0.5%.
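
The entries of Table 6.1 and the error estimates quoted above are easy to recompute. The following is a minimal sketch, not part of the original text; the function name exp_partial_sum is our own, and each term is obtained from the preceding one by multiplying by x and dividing by the term number, exactly as in the argument about the growth of the denominator.

```python
import math

def exp_partial_sum(x, terms):
    """Sum of the first `terms` terms of 1 + x + x^2/2! + x^3/3! + ..."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # x^n/n!  ->  x^(n+1)/(n+1)!
    return total

for x in (0.1, 0.5, 1.0, 2.0):
    exact = math.exp(x)
    for terms in (2, 3, 4, 5, 6, 7):
        approx = exp_partial_sum(x, terms)
        error_pct = 100 * (exact - approx) / exact
        print(f"x = {x:<4} terms = {terms}  sum = {approx:.4f}  error = {error_pct:.2f}%")
```

For x = 2 the printout shows the error falling from about 5% with five terms to about 0.5% with seven terms.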

Let us construct formulas of the same type for trigonometric functions. For instance, for y(x) = sin x we have

y'(x) = cos x,  y''(x) = -sin x,
y'''(x) = -cos x,  y^IV(x) = sin x.

The law for subsequent derivatives is obvious. Substituting x = 0, we get

y(0) = 0,  y'(0) = 1,
y''(0) = 0,  y'''(0) = -1, ....

Consequently,

sin x = x - x^3/6 + x^5/120 - x^7/5040 + ....   (6.2.4)

In similar fashion we get the formula

cos x = 1 - x^2/2 + x^4/24 - x^6/720 + ....   (6.2.5)

Figure 6.2.1 shows the graphs of the sine and cosine, and also the graphs of the polynomials obtained by taking one, two, and three terms in the corresponding series. Accuracy improves visibly when we take more and more terms of the series.

Tables 6.2 and 6.3 list the values of the sine and cosine functions, respectively. In the tables both φ and x are angles, but φ is expressed in degrees, while x is the corresponding angle expressed in radians. It is evident from the tables that two or three terms of the series suffice to obtain excellent accuracy in the interval from 0 to π/4. Thus, a power series offers a very convenient practical method for computing the values of trigonometric functions. Note that in absolute value the nonzero terms of the series for the sine and cosine are exactly equal to the corresponding terms of the series for the function e^x. For this reason, everything we have said pertaining to the falling off of terms with high powers of x in formula (6.2.2) for e^x refers also to the series (6.2.4) and (6.2.5), and these series, just like (6.2.2), enable calculating the value of the corresponding function for an arbitrary x.

TABLE 6.2

x        φ°   sin x    x        x - x^3/6   x - x^3/6 + x^5/120
0        0    0.0000   0.0000   0.0000      0.0000
π/20     9    0.1564   0.1571   0.1564      0.1564
π/10     18   0.3090   0.3142   0.3090      0.3090
3π/20    27   0.4540   0.4712   0.4538      0.4540
4π/20    36   0.5878   0.6283   0.5869      0.5878
5π/20    45   0.7071   0.7854   0.7046      0.7071
6π/20    54   0.8090   0.9425   0.8029      0.8091
7π/20    63   0.8910   1.0996   0.8780      0.8914
8π/20    72   0.9510   1.2566   0.9258      0.9519
9π/20    81   0.9877   1.4137   0.9427      0.9898
π/2      90   1.0000   1.5708   0.9248      1.0045

TABLE 6.3

x        φ°   cos x    1 - x^2/2   1 - x^2/2 + x^4/24   1 - x^2/2 + x^4/24 - x^6/720
0        0    1.0000   1.0000      1.0000               1.0000
π/20     9    0.9877   0.9877      0.9877               0.9877
π/10     18   0.9510   0.9506      0.9510               0.9510
3π/20    27   0.8910   0.8890      0.8911               0.8910
4π/20    36   0.8090   0.8026      0.8091               0.8090
5π/20    45   0.7071   0.6916      0.7075               0.7071
6π/20    54   0.5878   0.5558      0.5887               0.5877
7π/20    63   0.4540   0.3954      0.4563               0.4539
8π/20    72   0.3090   0.2105      0.3144               0.3089
9π/20    81   0.1564   0.0007      0.1672               0.1561
π/2      90   0.0000   -0.2337     0.0200               -0.0009

It must be emphasized, however, that the fact that the terms in a series decrease, as n → ∞, for an arbitrary x and, the more so, the sequence of “partial” (finite) sums tends to a definite number as n → ∞, can in no way serve as a general rule: in certain respects the series (6.2.1), (6.2.1a), (6.2.2), (6.2.4), and (6.2.5) are more “safe” than the general series (6.1.18) and (6.1.19). Section 6.3 is devoted to this question.
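
The columns of Tables 6.2 and 6.3 can be regenerated with a few lines of code. This is only an illustrative sketch; the names sin_partial_sum and cos_partial_sum are ours, and the terms of (6.2.4) and (6.2.5) are built up one from another rather than computed through factorials.

```python
import math

def sin_partial_sum(x, terms):
    """First `terms` nonzero terms of x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def cos_partial_sum(x, terms):
    """First `terms` nonzero terms of 1 - x^2/2! + x^4/4! - ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

for step in range(0, 11):
    x = step * math.pi / 20
    print(f"{9 * step:3d} deg  sin = {math.sin(x):.4f}  "
          f"2 terms = {sin_partial_sum(x, 2):.4f}  3 terms = {sin_partial_sum(x, 3):.4f}  "
          f"cos = {math.cos(x):.4f}  3 terms = {cos_partial_sum(x, 3):.4f}")
```

Up to 45 degrees the three-term sums agree with the exact values to about three decimal places, and the agreement deteriorates toward 90 degrees, as the tables show.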


Taylor’s and Maclaurin’s series find wide application in higher mathematics; some of these applications are discussed below (e.g. see Section 6.6). Here we wish to discuss the problem of employing series in the integration of functions. If the function y = f(x) can be expanded in a power series,

f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n + ...,   (6.2.6)

then, generally speaking (with certain reservations discussed in Section 6.3; here we will not touch on these restrictions),

∫ f(x) dx = ∫ (a_0 + a_1 x + a_2 x^2 + ... + a_n x^n + ...) dx
          = C + a_0 x + (a_1/2) x^2 + (a_2/3) x^3 + ... + (a_n/(n + 1)) x^(n+1) + ....   (6.2.7)

This result is valid regardless of whether the indefinite integral (antiderivative) ∫ f(x) dx can be expressed by an algebraic formula or not. For instance, above we stated that the function ∫ e^(-t^2) dt cannot be expressed by a (finite) formula, but an infinite series (“formula”) can be employed to express it. Replacing x with -t^2 in (6.2.2), we get

e^(-t^2) = 1 - t^2 + t^4/2! - t^6/3! + ... + (-1)^n t^(2n)/n! + ...,   (6.2.8)

whence

∫ e^(-t^2) dt = C + t - t^3/3 + t^5/10 - t^7/42 + ... + (-1)^n t^(2n+1)/(n! (2n + 1)) + ....   (6.2.9)

In particular,^6.6

∫_0^x e^(-t^2) dt = x - x^3/3 + x^5/(2!·5) - x^7/(3!·7) + ... + (-1)^n x^(2n+1)/(n! (2n + 1)) + ....   (6.2.10)


Exercises

6.2.1. Express the coefficients c_0, c_1, c_2, ... of the series (polynomial) (6.2.1a) in terms of the coefficients b_0, b_1, b_2, ... of polynomial (6.2.1).

6.2.2. It is known that the sine-integral function si x = ∫ (sin x / x) dx cannot be expressed by a formula explicitly. Represent si x as an infinite series.

6.3 Cases Where Series Expansions 
Cannot be Applied. The Geometric 
Progression 

In the preceding section we set up 
formulas in the form of series in powers 
of x with constant coefficients for four 
functions: the polynomial, e x , sin x, 
and cos x. In all these cases it was 
found that for arbitrary x , each subse- 
quent term in a series, with the pos- 
sible exception of the first few terms, 
is less than the preceding one, and the 
greater the number-label of the term, 
the closer the term is to zero (while 
in the first, “safest,” case all terms 
starting from a certain term vanish). 
In these examples, we can compute the 
value of the function for any x by means 
of a series if a large enough number of 
terms of the series is taken so that the 
discarded terms have practically no 
effect on the result. 

To summarize, then, we began with the problem of approximating a function in a small range of the variable and constructed more and more exact formulas by taking into account the first, second, third, and higher derivatives. The accuracy of each formula,

y(x) ≈ y(a),   (0)

y(x) ≈ y(a) + (x - a) y'(a),   (I)

y(x) ≈ y(a) + (x - a) y'(a) + ((x - a)^2/2) y''(a),   (II)

etc., is the greater, the smaller the quantity |x - a|. Moreover, for a given x - a formula (I) is more accurate than formula (0), formula (II) is more accurate than formula (I), and so on. Hence, if we increase the number of terms of a series, this permits increasing the quantity |x - a| while preserving a given accuracy.

6.6 Formula (6.2.10) is applicable for any finite x; however, it is clear that the important result ∫_0^∞ e^(-t^2) dt = √π/2 cannot be derived from (6.2.10); quite different methods are required to do this.

The question now arises as to whether 
it is always possible to attain a given 
accuracy for any value of | x — a \ 
merely by increasing the number of 
terms in a series. We will use an im- 
portant example to illustrate this point 
and will show that this is not so. A power
series constructed so as to yield a
good approximation in a small range
of x, that is, for small |x - a|, can
be useless in the region of large |x - a|.
In other words, this series may have a 
natural limit of applicability, the limit 
of admissible increase in | x — a | (the 
limit not depending on the number of 
terms taken). In the examples of the 
preceding section this was not evident 
because the series were specially 
chosen, but such good behavior is an 
exception rather than a general rule. 

Consider the following simple function:

y = 1/(1 - x).

Taking the derivatives in succession yields

y' = 1/(1 - x)^2,  y'' = 1·2/(1 - x)^3,  ...,
y^(n) = n!/(1 - x)^(n+1),  ....

Substituting x = 0, we get

y(0) = 1,  y'(0) = 1,  y''(0) = 2,  ...,  y^(n)(0) = n!,  ....

We thus arrive at the series

1/(1 - x) = 1 + x + x^2 + x^3 + ... + x^n + ....   (6.3.1)

The example of the function (1 - x)^(-1) is remarkable not only due to the unusually simple form of the resulting power series (all the coefficients are equal to 1). Here it is easy to give an exact formula for the sum of the first n terms of the series (6.3.1):

1 + x + x^2 + ... + x^(n-1) = (1 - x^n)/(1 - x).   (6.3.2)

The validity of this formula, the formula for the sum of the n terms of a geometric progression, is known to any secondary-school student. (Just multiply both sides of (6.3.2) by 1 - x.) We can rewrite formula (6.3.2) as follows:

1 + x + x^2 + ... + x^(n-1) = 1/(1 - x) - x^n/(1 - x).   (6.3.3)

Comparing (6.3.1) and (6.3.3), we see that x^n/(1 - x) is the remainder in the approximate equation

1/(1 - x) ≈ 1 + x + x^2 + ... + x^(n-1),   (6.3.4)

that is, the quantity we neglect when confining ourselves to the first n terms in the series

1 + x + x^2 + x^3 + ... + x^n + ...,   (6.3.1a)

which is the right-hand side of (6.3.1).

If x lies between -1 and 1, then the greater the n, the closer x^n is to zero and, consequently, if we take a sufficiently large number of terms, we discard only a small quantity (remainder). Series that possess this property are known as convergent. The convergence of the series (6.3.1a) at x = 1/2 is illustrated by the table below; here 1/(1 - x) = 2. The quantity Δ in the table is the error (in %) introduced by the approximate formula (6.3.4), that is, the ratio of the difference between the left- and right-hand sides of (6.3.4) to the left-hand side.

n                             1     2     3       4       5        6
1 + x + x^2 + ... + x^(n-1)   1     1.5   1.75    1.875   1.9375   1.9688
Δ                             50%   25%   12.5%   6.25%   3.12%    1.56%

Note that the closer x is to 1, the more terms we have to take in order to obtain a given accuracy.
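
The table for x = 1/2 and the remark about values of x close to 1 can both be reproduced with a short computation. This is a sketch under our own naming (geometric_partial_sum, terms_needed); the error is the relative remainder, which by (6.3.3) equals x^n.

```python
def geometric_partial_sum(x, n):
    """1 + x + x^2 + ... + x^(n-1), the right-hand side of (6.3.4)."""
    return sum(x ** k for k in range(n))

def terms_needed(x, tolerance):
    """Smallest n for which the relative remainder x^n is below `tolerance` (0 < x < 1)."""
    n = 1
    while x ** n >= tolerance:
        n += 1
    return n

x = 0.5
exact = 1 / (1 - x)
for n in range(1, 7):
    s = geometric_partial_sum(x, n)
    print(f"n = {n}  sum = {s:.5f}  error = {100 * (exact - s) / exact:.2f}%")

# The closer x is to 1, the more terms a 1% accuracy requires.
for x in (0.5, 0.9, 0.99):
    print(f"x = {x}: {terms_needed(x, 0.01)} terms")
```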

The whole picture changes if we take |x| ≥ 1. For instance, if x > 1, each subsequent term in the series (6.3.1a) is greater than the preceding one. Formula (6.3.3) remains valid, but for x > 1 the quantity x^n increases without bound together with n and for this reason we cannot disregard the fraction x^n/(1 - x). Here, (6.3.1) does not hold true. There is not even any qualitative similarity between the sum of the positive terms of (6.3.1a) and the negative (since x is greater than 1) quantity 1/(1 - x). For instance, at x = 2 formula (6.3.1) becomes absurd:

-1 = 1 + 2 + 4 + 8 + 16 + ....

For x > 1, the sum of the series (6.3.1a) increases without bound with n (this is clearly seen from (6.3.3)). And if x ≤ -1, in the right-hand side of (6.3.1) we have an alternating sum, which is not represented in any way by the quantity 1/(1 - x). For instance, at x = -1 formula (6.3.1) yields

1/2 = 1 - 1 + 1 - 1 + 1 - ....

Series similar to those into which (6.3.1a) transforms when |x| ≥ 1 are called divergent.

Thus, the sum of the terms of an infinite geometric progression (6.3.1a) is equal to 1/(1 - x) if |x| < 1, but if |x| ≥ 1, the infinite geometric progression has no finite sum (it diverges), in other words, the sum of the first n terms of such a series does not tend, as n → ∞, to any finite limit.

Also note that any periodic (decimal) 
fraction is a sum of terms of a geometric 
progression. For instance, 

1.(1) = 1.1111...
      = 1 + 0.1 + 0.01 + 0.001 + ...
      = 1 + 0.1 + (0.1)^2 + (0.1)^3 + ...
      = 1/(1 - 0.1) = 1/0.9 = 10/9 = 1 1/9.

Thus, we have already encountered an 
elementary series (the geometric pro- 
gression) in arithmetic and algebra. 

The function y = 1/(1 — x) (Fig- 
ure 6.3.1) has a discontinuity at x = 1; 
if x is close to 1 but greater than 1, 
then 1/(1 — x) is a large (in absolute 
value) negative number, while if x is 
close to 1 but smaller than 1, then 
1/(1 — x) is a large positive number. 
Thus, when x passes through the value 
x = 1, the value of the function 1/(1 — 
x) jumps from large positive values to 
large (in absolute value) negative num- 
bers. A series cannot describe such 
behavior, and so it is of no use here. 

We note yet another circumstance. As x approaches 1, the function y = 1/(1 - x) becomes infinite (the closer x is to 1, the greater y is in absolute value), and at x = 1 the terms in (6.3.1a) cease to decrease as n → ∞. A series is suitable for computational purposes only if its terms tend to zero rapidly, that is, decrease in absolute value.^6.7 At x = 1, the formula (6.3.1) is incorrect, since the terms of the series on the right-hand side do not decrease. Since the absolute values of the terms do not depend on the sign of x, this means that the series (6.3.1a) is unsuitable for computing the values of the function y = 1/(1 - x) at x = -1 and for x < -1 (the terms in (6.3.1a) do not diminish either) although the function itself does not have a discontinuity at x = -1 and is equal to 1/[1 - (-1)] = 1/2.

No matter how we choose the coefficients of a polynomial, the graph will always be a solid continuous line: a polynomial does not have discontinuities. Therefore, if some function f(x) has a discontinuity at x = x_0 (x_0 = 1 in the example involving 1/(1 - x)), then for the value x = x_0 the series constructed for f(x) is definitely unsuitable for computations. Since the greater the absolute value of x, the greater (in absolute value) each term c_n x^n in the series c_0 + c_1 x + c_2 x^2 + ... is, it follows that for an arbitrary x that is greater in absolute value than x_0, the series is likewise unsuitable for computation.

Thus, in the case of a discontinuity in a function f(x) we can indicate beforehand an x_0 such that for all x exceeding x_0 in absolute value the series representing f(x) will prove to be unsuitable for computational purposes.

6.7 Of course, if one or two or several of the first terms increase, there is no harm done if the subsequent terms of the series fall off rapidly, see the example involving e^x for x = 2 (Table 6.1).

Consider another example. We wish to construct Maclaurin’s series for the function y = tan x. Applying the general rule for finding derivatives, we obtain

y = tan x = sin x / cos x,
y'(x) = 1/cos^2 x,
y''(x) = 2 sin x / cos^3 x,
y'''(x) = (2 + 4 sin^2 x)/cos^4 x,
y^IV(x) = (16 sin x + 8 sin^3 x)/cos^5 x,
y^V(x) = (16 + 88 sin^2 x + 16 sin^4 x)/cos^6 x.

This yields y(0) = 0, y'(0) = 1, y''(0) = 0, y'''(0) = 2, y^IV(0) = 0, y^V(0) = 16, and so on. Whence,

tan x = 0 + 1·x + 0·x^2 + (2/3!) x^3 + 0·x^4 + (16/5!) x^5 + ....

Thus,

tan x = x + x^3/3 + (2/15) x^5 + (17/315) x^7 + (62/2835) x^9 + ....   (6.3.5)

(the coefficients of x^7 and x^9 can be obtained in the same manner as the coefficients of x, x^3, and x^5 were obtained).

What can be said of the range of applicability of the series (6.3.5)? The graph of the tangent function (Figure 4.10.5) tells us that this series is suitable for computational purposes only for |x| < π/2, since at x = π/2 the function’s behavior is as bad as that of function 1/(1 - x) was for x = 1.

But the appearance of the series (6.3.5) does not suggest the value of x for which the series cannot be employed, since the law by which the expansion coefficients are constructed is not simple, in contrast to the law for the coefficients in series (6.3.1a).^6.8
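
Although the coefficients of (6.3.5) do not reveal the limit of applicability at a glance, a numerical experiment hints at it. The sketch below is an illustration with names of our own choosing (TAN_COEFFS, tan_partial_sum, coefficient_ratios); it uses only the five coefficients written out in (6.3.5).

```python
import math
from fractions import Fraction

# Coefficients of x, x^3, x^5, x^7, x^9 in series (6.3.5).
TAN_COEFFS = [Fraction(1), Fraction(1, 3), Fraction(2, 15),
              Fraction(17, 315), Fraction(62, 2835)]

def tan_partial_sum(x, terms=5):
    """Sum of the first `terms` terms of (6.3.5)."""
    return sum(float(c) * x ** (2 * k + 1)
               for k, c in enumerate(TAN_COEFFS[:terms]))

def coefficient_ratios():
    """Ratios of successive coefficients of (6.3.5)."""
    return [float(TAN_COEFFS[k] / TAN_COEFFS[k + 1])
            for k in range(len(TAN_COEFFS) - 1)]

for x in (0.25, 0.5, 1.0, 1.5):
    print(f"x = {x:<5} series = {tan_partial_sum(x):.5f}  tan x = {math.tan(x):.5f}")

print("ratios:", [round(r, 4) for r in coefficient_ratios()])
print("(pi/2)^2 =", round((math.pi / 2) ** 2, 4))
```

The five-term sum is excellent at x = 0.5 and poor at x = 1.5, which lies close to π/2; the ratios of successive coefficients approach (π/2)^2.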

Note that the presence of a discontinuity of a function is a sufficient condition for the series to cease to converge, but it is not a necessary condition. By way of an illustration let us consider the function y = 1/(1 + x). Applying formula (6.1.19) or simply replacing x with -x in (6.3.1), we get

1/(1 + x) = 1 - x + x^2 - x^3 + ....   (6.3.6)

Take x = 2, for instance. Then

1/(1 + x) |_(x=2) = 1/(1 + 2) = 1/3.

6.8 Note that the ratios of the successive coefficients in the series (6.3.5), 1 ÷ 1/3 = 3, 1/3 ÷ 2/15 = 2.5, 2/15 ÷ 17/315 ≈ 2.4706, 17/315 ÷ 62/2835 ≈ 2.4677, rapidly converge to the value (π/2)^2 ≈ 2.4674. However, there is no way in which we can prove this remarkable fact here.






The sum of the terms of the series

1 - x + x^2 - x^3 + ...,   (6.3.7)

however, oscillates rapidly as the number n of terms increases, and does not resemble 1/3 in any way:

n                                  1    2    3    4    5     6    7   ...
Sum of n terms of series (6.3.7)   1   -1    3   -5   11   -21   43   ...

The series is clearly unsuitable for calculating at x = 2 the value of the function y = 1/(1 + x). Why does this occur, particularly since the function itself, y = 1/(1 + x), has no discontinuity either at x = 2 or anywhere between 0 and x = 2 (within these limits the function is smooth, well-behaved, see Figure 6.3.2)?

The reason why the series (6.3.7) does not “work” at x = 2 is that y = 1/(1 + x) has a discontinuity at x = -1. For this reason, at x = -1 the absolute values of the terms of the series (6.3.7) do not diminish as n grows. But the absolute values of the terms in (6.3.7) do not depend on the sign of x. Consequently, for x = 1 (and all the more so for x > 1) this series is not suitable for computation.

Thus, even if we are interested in 
the behavior of a series only for posi- 
tive values of x, we still have to take 
into account all the values of x , includ- 
ing negative values as well, for which 
the function undergoing expansion has 
a discontinuity. 
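
The contrast between a point inside the range of applicability and a point outside it is easy to see numerically. The following sketch (the function name alternating_partial_sums is ours) lists the partial sums of the series 1 - x + x^2 - x^3 + ... for the function 1/(1 + x).

```python
def alternating_partial_sums(x, count):
    """Partial sums of 1 - x + x^2 - x^3 + ... taking 1, 2, ..., `count` terms."""
    sums, total, term = [], 0, 1
    for _ in range(count):
        total += term
        sums.append(total)
        term *= -x
    return sums

# Outside the range of applicability: the sums oscillate with growing amplitude.
print(alternating_partial_sums(2, 7))

# Inside it: the sums settle down to 1/(1 + 0.5) = 2/3.
print(alternating_partial_sums(0.5, 10))
print(1 / (1 + 0.5))
```

At x = 2 the sums reproduce the oscillating sequence of the table above, while at x = 0.5 ten terms already give 1/(1 + x) to about three decimal places.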



Figure 6.3.3 

For the reader acquainted with complex numbers (see Chapters 14 and 15) we note that the convergence of a series for real-valued x is affected by the behavior of the corresponding function for complex values of the independent variable. Here is an example (see also Section 15.2). Replacing x with x^2 in (6.3.6), we get

1/(1 + x^2) = 1 - x^2 + x^4 - x^6 + ....   (6.3.8)

The graph of the function y = 1/(1 + x^2) (Figure 6.3.3) has no discontinuities for positive and negative x’s and does not go to infinity, but the series (6.3.8) is suitable for computations only if x^2 < 1, that is, for -1 < x < 1. The reason for this is that when x = ±√(-1) = ±i, that is, at x^2 = -1, the function y = 1/(1 + x^2) becomes infinite and therefore the terms of the series do not decrease in absolute value at x^2 = -1. Hence, neither do they decrease in absolute value at x^2 = 1 (see p. 481).

Other examples of functions that cannot be expanded in a Taylor’s series, say, the function e^(-1/x^2) at x = 0 (see p. 198), can also be cited. Fractional powers of x and functions that contain fractional powers of the independent variable, say, the function sin √x, cannot be expanded in a Maclaurin’s series.

Remaining within the scope of this book for beginners, we find it rather difficult to go any further into the question of the applicability of Taylor’s and Maclaurin’s series; to do this, the reader must turn to a textbook on the theory of functions of a complex variable (cf. Chapter 15). However, we must say, to comfort the reader, that practically all functions that a physicist or engineer needs can be expanded in Taylor’s series everywhere with the exception of a finite number of (singular) points. As a rule we encounter functions that can be expanded, at least in a properly chosen finite interval, into Taylor’s series.

Exercises

6.3.1. Write Maclaurin’s series for the functions (a) y = (x + 1)/(1 - x), and (b) y = ln (1 + x).

6.3.2. Write the Taylor expansion of y = ln x in powers of x - 1.

What are the ranges of applicability of the series obtained in Exercises 6.3.1 and 6.3.2?

6.3.3. Suppose that f(x) and g(x) are known functions. Find the first three terms of the series expansion in powers of x of the function f(x) g(x). Construct the same series by multiplying together the series for f(x) and the series for g(x). Compare the results.


6.4 The Binomial Theorem for Integral 
and Fractional Exponents 

Let us form Maclaurin’s series expansion of a binomial a + x to an arbitrary power m, that is, y = (a + x)^m.

Using the general rule, let us first find the derivatives

y' = m (a + x)^(m-1),
y'' = m (m - 1) (a + x)^(m-2), ...,
y^(n) = m (m - 1) ... (m - n + 1) (a + x)^(m-n), ...   (6.4.1)

and the values of the function and derivatives at x = 0:

y(0) = a^m,  y'(0) = m a^(m-1),
y''(0) = m (m - 1) a^(m-2), ...,
y^(n)(0) = m (m - 1) ... (m - n + 1) a^(m-n), ....   (6.4.2)

From this we get Maclaurin’s series

(a + x)^m = a^m + (m/1) a^(m-1) x + (m (m - 1)/(1·2)) a^(m-2) x^2 + ... + (m (m - 1) (m - 2) ... (m - n + 1)/n!) a^(m-n) x^n + ....   (6.4.3)

If the exponent m is a positive integer, then (a + x)^m is a polynomial of degree m, so that in this case the series (6.4.3) is finite: the (m + 1)st derivative of the function (a + x)^m is zero, and so are all higher derivatives. The formulas (6.4.1) to (6.4.3) reflect this circumstance. Indeed, at n = m + 1 the factor m - n + 1 vanishes; for n > m + 1 there will be, some place in the sequence of the factors m, m - 1, ..., a factor equal to zero, and the product will be equal to zero, too.

For a positive integer m, the product in the numerator can be written in a more convenient form. To this end we multiply and then divide the product m (m - 1) ... (m - n + 1) by (m - n) (m - n - 1) ... 3·2·1. The result is

m (m - 1) ... (m - n + 1) = [m (m - 1) ... 3·2·1] / [(m - n) (m - n - 1) ... 3·2·1] = m!/(m - n)!.

Thus, for positive integral m we finally have

(a + x)^m = a^m + (m!/(1! (m - 1)!)) a^(m-1) x + (m!/(2! (m - 2)!)) a^(m-2) x^2 + ...
          + (m!/(n! (m - n)!)) a^(m-n) x^n + ...
          + (m!/((m - 2)! 2!)) a^2 x^(m-2) + (m!/((m - 1)! 1!)) a x^(m-1) + x^m.   (6.4.4)


In formula (6.4.4) we have polynomials of degree m on the right and on the left. Thus, for the case of a positive integer m, we obtain an exact equation that is valid for arbitrary powers of x, that is, formula (6.4.4). This formula is symmetric with respect to x and a: the coefficients of the terms a^(m-n) x^n and a^n x^(m-n) are the same. This is obvious since (x + a)^m does not depend on the order of the summands in the parentheses: (x + a)^m = (a + x)^m.
Formula (6.4.4) is called the binomial theorem (sometimes it is called Newton’s binomial theorem, see below) or the binomial expansion. It can be obtained without resorting to derivatives and series. We have to take the product (a + x)(a + x)(a + x)...(a + x) consisting of m factors, perform the multiplication, and collect like terms. However, when m is specified in the general form by a symbol and not a number, collecting like terms is rather difficult. On the whole, the derivation of the binomial expansion via methods of higher mathematics, that is, using Maclaurin’s series, is simpler.
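
For a specific positive integer m the binomial expansion (6.4.4) is easily verified by direct computation. The sketch below is illustrative; the function names binomial_coefficient and expand_binomial are ours, and integer arithmetic is used so that the comparison is exact.

```python
import math

def binomial_coefficient(m, n):
    """m! / (n! (m - n)!), the coefficient of a^(m-n) x^n in (6.4.4)."""
    return math.factorial(m) // (math.factorial(n) * math.factorial(m - n))

def expand_binomial(a, x, m):
    """Sum of all m + 1 terms on the right-hand side of (6.4.4)."""
    return sum(binomial_coefficient(m, n) * a ** (m - n) * x ** n
               for n in range(m + 1))

m = 5
coefficients = [binomial_coefficient(m, n) for n in range(m + 1)]
print(coefficients)                            # symmetric in a and x
print(expand_binomial(3, 2, m), (3 + 2) ** m)  # both sides of (6.4.4)
```

The list of coefficients reads the same from either end, which is the symmetry with respect to a and x noted above.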

We note that Newton obtained the general formula (6.4.3), that is, the expansion of (x + a)^m for the case of an arbitrary exponent m. It would therefore be more appropriate to call formula (6.4.3) Newton’s binomial theorem instead of (6.4.4), which was known before Newton’s time^6.9 and which is a simple particular case of the relationship connecting the coefficients in the representations (6.2.1) and (6.2.1a) of a polynomial, see Exercise 6.2.1.

Let us return to the general formula (6.4.3). Suppose that m is not a positive integer. Since the powers n of the variable x are positive integers, this means that not a single factor m, m - 1, ..., m - n + 1 in the numerator of the coefficient of x^n will vanish and, hence, (6.4.3) yields an infinite series. For instance, for m = -1 this series is of the form

1/(a + x) = 1/a - x/a^2 + x^2/a^3 - x^3/a^4 + ....   (6.4.5)

Note that at a = 1 formula (6.4.5) passes into the familiar formula (6.3.6) for the sum of the terms of an (infinite) geometric progression.

From (6.4.5) we also find that

1/(a - x) = 1/a + x/a^2 + x^2/a^3 + x^3/a^4 + ....

For m = 1/2 we have

√(a + x) = √a + x/(2√a) - x^2/(8a√a) + x^3/(16a^2 √a) - 5x^4/(128a^3 √a) + 7x^5/(256a^4 √a) - 21x^6/(1024a^5 √a) + ....   (6.4.6)


6.9 In Europe formula (6.4.4), where m is
a positive integer, was first discovered by 
Niccolo Tartaglia (c. 1500-1557), but even 
before that the formula was known to Arabian 
mathematicians. 


In the expansion of (a + x)^m with arbitrary m, all the terms have the same sum of powers of a and x, each subsequent term differing from the preceding one by the factor x/a and the coefficient. A physicist would say that a and x in formula (6.4.3) must have the same dimensions, and so x/a is dimensionless. From the very beginning we could take a outside the brackets:

(a + x)^m = a^m (1 + x/a)^m,

and expand (1 + x/a)^m in powers of x/a.

It turns out that for all m (fractional, positive, and negative) the series (6.4.3) is suitable only for |x/a| < 1, that is, for |x| < |a|. For |x/a| > 1 the series in (6.4.3) is divergent. The positive integers m are an exception because in that case formula (6.4.3) consists of a finite number of terms.

Putting a = 1 in (6.4.3), replacing x with r (where |r| < 1), and confining ourselves to the first two terms on the right-hand side of (6.4.3), we arrive at the following approximate formula

(1 + r)^m ≈ 1 + mr,

which was often used above; this formula is valid only for small |r| (a more exact formula has the form (1 + r)^m ≈ 1 + mr + (m (m - 1)/2) r^2).

Formula (6.4.6) offers a good method for taking roots. Here, the smaller the ratio |x/a| the fewer terms one has to take in (6.4.6) to attain a specified accuracy.
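
As an illustration of taking roots by means of the binomial series, the sketch below sums the first few terms of (6.4.3) written in the form a^m (1 + x/a)^m. The function name binomial_series is ours, and the example √10 = √(9 + 1) is chosen so that a^m = 3 is known exactly; for a general a the factor a^m would itself have to be known beforehand.

```python
import math

def binomial_series(a, x, m, terms):
    """First `terms` terms of a^m (1 + x/a)^m expanded in powers of x/a."""
    u = x / a
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff * u ** n
        coeff *= (m - n) / (n + 1)  # m(m-1)...(m-n+1)/n!  ->  next coefficient
    return a ** m * total

# |x/a| = 1/9: a handful of terms already gives high accuracy.
for terms in (2, 3, 4, 5):
    print(f"terms = {terms}  sqrt(10) ~ {binomial_series(9, 1, 0.5, terms):.6f}")
print(f"exact          sqrt(10) = {math.sqrt(10):.6f}")

# |x/a| = 2: the partial sums do not settle down at all.
print([round(binomial_series(1, 2, 0.5, terms), 2) for terms in (5, 10, 20)])
```

The second printout shows that for |x/a| > 1 adding more terms makes the result worse rather than better.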


Exercises 

6.4.1. Using a se ries expansion, find 
Via and V 1.5 as Vl+x at a: =0.1 and 
x = 0.5 retaining two, three, and four terms 
in the expansion. Compare the results with 
the tabular values. 

6.4.2. Show that for | x | < 1 the approx- 
imate formulas (a) f/"l +x~ 1 +x/n, and 

(b) y 1 + x ~ 1 -j- — n - a: 2 are valid and 
n 2 n 2 

that the smaller the value of x, the more accu- 
rate formulas (a) and (b) are (formula (b) is 
more accurate than formula (a)). 


13-0946 
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6.4.3. Using the fo rmul as of th e pre ceding 
exercise, find y+.2, f/ 1.1, and f/ 1.05, com- 
pare the values obtained with tabular values. 

6.4.4. Find $\sqrt{6}$ to three decimal places.
[Hint. Employ the fact that 6 = 4 + 2 and $\sqrt{4} = 2$ and apply
formula (6.4.6).]

6.4.5. Why is it impossible to expand 
y— Y x by Maclaurin’s formula? Can the 
function y — Y z 1000 be expanded using this 
formula? 


6.5 The Order of Increase and Decrease 
of Functions. L’Hospital’s Rule 

The series expansion of functions yields 
a general method for reducing different 
functions to the same form and enables 
one to compare the functions. This method of comparison is needed, for
example, when we consider the ratio of two functions, f(x)/g(x), for a
value of the
independent variable x for which the 
values of the two functions are close 
to zero. In the computation of deriva- 
tives it was demonstrated that the ratio 
of two almost-zero quantities can be 
a quite definite number. Such ratios 
may prove to be not very small (but 
not very large, either). In certain cases, 
this ratio may be equal to zero or in- 
finity (positive or negative), and such 
cases can be classified. A few examples 
will suffice. For the sake of simplicity 
of notation, we take examples in which 
the value of x that interests us is equal 
to zero. 

For small values of x, the functions sin x and tan x are also small. The
functions $e^x$ and cos x are close to unity and, hence, $e^x - 1$ and
1 − cos x are small. Here the smaller the value of |x|, the closer are the
values of the functions sin x, tan x, $e^x - 1$, and 1 − cos x to zero.

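Let us compare these functions with x. To do this, we write out their
Maclaurin-series expansions:

$$\sin x = x - \frac{x^3}{6} + \ldots,$$
$$\tan x = x + \frac{x^3}{3} + \ldots,$$
$$1 - \cos x = \frac{x^2}{2} - \frac{x^4}{24} + \ldots,$$
$$e^x - 1 = x + \frac{x^2}{2} + \ldots. \qquad (6.5.1)$$

This, for one, yields $(\sin x)/x = 1 - x^2/6 + \ldots$. Hence,
$(\sin x)/x \to 1$ as $x \to 0$, or $\lim_{x \to 0} \frac{\sin x}{x} = 1$.
Similarly, from (6.5.1) we find that

$$\frac{\tan x}{x} = 1 + \frac{x^2}{3} + \ldots \to 1 \quad \text{as } x \to 0,$$
$$\frac{1 - \cos x}{x^2} = \frac{1}{2} - \frac{x^2}{24} + \ldots \to \frac{1}{2} \quad \text{as } x \to 0,$$
$$\frac{1 - \cos x}{x} = \frac{x}{2} - \frac{x^3}{24} + \ldots \to 0 \quad \text{as } x \to 0,$$
$$\frac{e^x - 1}{x} = 1 + \frac{x}{2} + \ldots \to 1 \quad \text{as } x \to 0.$$
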

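More complicated relations can also be found. For instance, from

$$\sin x = x - \frac{x^3}{6} + \frac{x^5}{120} - \ldots,$$
$$\tan x = x + \frac{1}{3} x^3 + \frac{2}{15} x^5 + \ldots$$

it follows that

$$\tan x - \sin x = \frac{1}{2} x^3 + \frac{1}{8} x^5 + \ldots$$

and, hence,

$$\frac{\tan x - \sin x}{x^3} \to \frac{1}{2} \quad \text{as } x \to 0.$$
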


A scale can be constructed of the order of decrease of various functions
as x tends to zero. If f(x) tends to zero as x → 0, then we term the order
of decrease (or order of smallness) of f(x) with respect to x the power of
x that decreases just as rapidly as f(x). Precisely, if we say that f(x)
has kth order of decrease with respect to x, this means that it decreases
as $x^k$, that is, the ratio $f(x)/x^k$ has for its limit, as x → 0, a
finite nonzero number.

Thus, sin x, tan x, and $e^x - 1$ decrease by order one as x → 0, while
1 − cos x decreases by order two, and tan x − sin x decreases by order
three with respect to x.
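
The notion of order of decrease can be checked numerically. The sketch below is an illustration only: it assumes that f(x) behaves like $c x^k$ near zero, so that halving x divides f by about $2^k$. The helper name `order_of_decrease` is hypothetical.

```python
import math


def order_of_decrease(f, x=1e-2):
    """Estimate k in f(x) ~ c * x**k near x = 0 by comparing f(x) and f(x/2)."""
    return math.log(abs(f(x) / f(x / 2))) / math.log(2)


examples = {
    "sin x": math.sin,
    "tan x": math.tan,
    "e^x - 1": lambda x: math.exp(x) - 1.0,
    "1 - cos x": lambda x: 1.0 - math.cos(x),
    "tan x - sin x": lambda x: math.tan(x) - math.sin(x),
}

if __name__ == "__main__":
    for name, func in examples.items():
        print(f"{name:>14}: estimated order {order_of_decrease(func):.4f}")
```

The estimates come out close to 1, 1, 1, 2, and 3, in agreement with the orders obtained from the series.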

In certain cases it is possible to determine the order of decrease without
a series expansion. For instance, from the well-known Figure 6.5.1, where
MP and NA are lines of the sine and the tangent corresponding to
∠AOM = x, we can easily see that tan x > x > sin x (the reasoning is based
on the fact that
$S_{\triangle OMA} < S_{\text{sector } OMA} < S_{\triangle ONA}$).
Since sin x ≈ tan x for small x (because in this case
sin x / tan x = cos x ≈ 1), we conclude that sin x ≈ x and tan x ≈ x when
x is small, that is, sin x and tan x have the first order of decrease.
Next, since $1 - \cos x = 2 \sin^2 (x/2)$, and sin (x/2) is of the first
order of decrease, we conclude that 1 − cos x must be of the second order
of decrease. Finally, the function tan x − sin x can be written as
sin x / cos x − sin x = (sin x / cos x)(1 − cos x). Since for small x the
function sin x is of the first order of decrease, 1 − cos x is of the
second order, and cos x ≈ 1, it is clear that tan x − sin x must be of the
third order of decrease. However, these concrete devices require a great
deal of ingenuity, and precisely for this reason a general method that
operates without failure is particularly useful.

Figure 6.5.1

Such a relationship between ingeni- 
ous solutions of individual problems 
and general methods is in evidence every- 
where: the properties of tangent lines 
to a parabola, the area of a circle, the 
volume of a pyramid, and the volume 
of a sphere were all familiar to the 
ancient Greeks, but only differential 
and integral calculus provided us with 
general and simple methods for solv- 
ing all problems of that type, methods 
that any engineer, physicist, or stu- 


dent can master, while the ingenious 
methods of ancient scholars were suit- 
able only for the best minds of the 
time. 

Using series, it is possible not only 
to find the ratio of a function to a power 
of x, but also the ratio of one function 
to another. Here are some examples: 


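$$\frac{e^x - 1}{\sin x} = \frac{x + \frac{x^2}{2} + \frac{x^3}{6} + \ldots}{x - \frac{x^3}{6} + \ldots} = \frac{1 + \frac{x}{2} + \ldots}{1 - \frac{x^2}{6} + \ldots} \to 1 \quad \text{as } x \to 0,$$

$$\frac{e^x - 1}{1 - \cos x} = \frac{x + \frac{x^2}{2} + \ldots}{\frac{x^2}{2} - \frac{x^4}{24} + \ldots} = \frac{1 + \frac{x}{2} + \ldots}{\frac{x}{2} - \frac{x^3}{24} + \ldots} \to \infty \quad \text{as } x \to 0,$$

$$\frac{e^x - 1}{\sqrt{x}} = \frac{x + \frac{x^2}{2} + \ldots}{\sqrt{x}} = \sqrt{x} + \frac{x^{3/2}}{2} + \ldots \to 0 \quad \text{as } x \to 0.$$
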

The coefficients of Maclaurin’s series 
are expressed in terms of derivatives. 
It is therefore possible to state the re- 
sults obtained by means of series in 
the form of rules referring to derivatives. If f(0) = g(0) = 0, then

$$f(x) = f(0) + f'(0) x + \tfrac{1}{2} f''(0) x^2 + \ldots,$$
$$g(x) = g(0) + g'(0) x + \tfrac{1}{2} g''(0) x^2 + \ldots$$

can be simplified thus:

$$f(x) = f'(0) x + \tfrac{1}{2} f''(0) x^2 + \ldots,$$
$$g(x) = g'(0) x + \tfrac{1}{2} g''(0) x^2 + \ldots.$$


From this we have

$$\frac{f(x)}{g(x)} = \frac{f'(0) x + \frac{1}{2} f''(0) x^2 + \ldots}{g'(0) x + \frac{1}{2} g''(0) x^2 + \ldots} = \frac{f'(0) + \frac{1}{2} f''(0) x + \ldots}{g'(0) + \frac{1}{2} g''(0) x + \ldots} \to \frac{f'(0)}{g'(0)} \quad \text{as } x \to 0$$

and, hence,

$$\text{if } f(0) = g(0) = 0, \text{ then } \frac{f(x)}{g(x)} \to \frac{f'(0)}{g'(0)} \quad \text{as } x \to 0 \qquad (6.5.2)$$

(if, in addition, f'(0) = g'(0) = 0, then the ratio f(x)/g(x) tends to the
limit of the ratio f''(x)/g''(x) as x → 0). 6.10

Of course, rule (6.5.2) retains its meaning completely when
f(a) = g(a) = 0, where a is an arbitrary number, and we are interested in
the value of the ratio f(x)/g(x) when x is close to a:

$$\text{if } f(a) = g(a) = 0, \text{ then } \frac{f(x)}{g(x)} \to \frac{f'(a)}{g'(a)} \quad \text{as } x \to a \qquad (6.5.2a)$$

(if, in addition, f'(a) = g'(a) = 0, then the limits of the ratios
f'(x)/g'(x) and f''(x)/g''(x) coincide as x → a, in view of which
f(x)/g(x) → f''(a)/g''(a) as x → a). Here, one must use Taylor's series
instead of Maclaurin's series (see Exercise 6.5.2).
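
Rule (6.5.2a) lends itself to a quick numerical check. The sketch below is illustrative only: the derivatives are approximated by central differences, the step sizes are arbitrary choices, and the helper names `ratio_near` and `derivative` are hypothetical.

```python
import math


def ratio_near(f, g, a, h=1e-4):
    """Value of f(x)/g(x) at a point x = a + h close to a."""
    return f(a + h) / g(a + h)


def derivative(f, a, h=1e-5):
    """Central-difference approximation to f'(a)."""
    return (f(a + h) - f(a - h)) / (2.0 * h)


def lhospital_estimate(f, g, a):
    """Right-hand side of rule (6.5.2a): f'(a)/g'(a)."""
    return derivative(f, a) / derivative(g, a)


if __name__ == "__main__":
    cases = [
        ("(e^x - 1)/sin x at 0", lambda x: math.exp(x) - 1.0, math.sin, 0.0),
        ("(x^2 - 1)/(x - 1) at 1", lambda x: x * x - 1.0, lambda x: x - 1.0, 1.0),
    ]
    for name, f, g, a in cases:
        print(name, ratio_near(f, g, a), lhospital_estimate(f, g, a))
```

In both cases the two printed numbers agree to within the size of the step h, as the rule predicts.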

Thus, if the values of two functions are close to zero, that is, if the
two functions vanish at a single value of the independent variable in the
vicinity of which these functions are considered, then the ratio of the
functions can be replaced with the ratio of their derivatives. This result
is called L'Hospital's rule (actually discovered by Johann Bernoulli and
given to L'Hospital in return for salary). 6.11

6.10 Since here, in view of the same formula (6.5.2),
$\lim_{x \to 0} \frac{f'(x)}{g'(x)} = \frac{f''(0)}{g''(0)}$, we arrive at
$\lim_{x \to 0} \frac{f(x)}{g(x)} = \frac{f''(0)}{g''(0)}$. Similarly, if
f(0) = f'(0) = f''(0) = g(0) = g'(0) = g''(0) = 0, then
$\lim_{x \to 0} \frac{f(x)}{g(x)} = \frac{f'''(0)}{g'''(0)}$, and so on.

After studying series, it is more con- 
venient not to bother remembering some 
special rules for finding derivatives 
but, for small values of x, to use series 
in which the function is expanded in 
powers of x. Whenever there is a sum 
of different powers of x, we leave only 
the lowest-degree term when passing 
to small values of x . 

Just as we considered, for small x, the order of decrease of functions
equal to zero when x = 0, we can examine the behavior of functions when x
increases without bound, that is, as x → ∞. Here it is appropriate to
expand the function in powers of 1/x (since 1/x is small by hypothesis)
via substitution of t for 1/x. If our function is a polynomial, then it is
obvious that for large values of x only the highest-degree term in x is of
importance, since all other terms are smaller:

$$a x^n + b x^{n-1} + \ldots = a x^n \left( 1 + \frac{b}{a} \cdot \frac{1}{x} + \ldots \right);$$

here the expansion in the parentheses tends to 1 as x → ∞. We can speak of
the order of increase of a function, that is, that a function increases as
x, as $x^2$, as $x^3$, etc. It is said that a function f(x) that increases
without bound in absolute value as x → ∞ has kth order of increase if the
ratio $f(x)/x^k$ has for its limit, as x → ∞, a finite nonzero number. It
is also clear that any polynomial of degree n has an order of increase
equal to n.

A fact of prime importance is that the function $e^x$ increases faster
than any power $x^n$ for x increasing without bound, that is, $e^x$ has an
infinitely large order of increase, or k = ∞. To prove this, use the
series expansion (6.2.2) of $e^x$ which, as pointed out above, is valid
for any value of x. We have

$$\frac{e^x}{x^n} = \frac{1}{x^n} + \frac{1}{x^{n-1}} + \ldots + \frac{1}{n!} + \frac{x}{(n+1)!} + \frac{x^2}{(n+2)!} + \ldots \qquad (6.5.3)$$

For a given n and a sufficiently large x, the fraction $e^x/x^n$ will
become as large as desired due to terms with positive powers of x on the
right-hand side of (6.5.3). Clearly, the same goes for the function
$e^{kx}$ with any positive value of k (or for the function
$a^x = e^{x \ln a}$ with a > 1): setting kx = y, we find that

$$\frac{e^{kx}}{x^n} = k^n \frac{e^{kx}}{(kx)^n} = k^n \frac{e^y}{y^n} \to \infty \quad \text{as } y \to \infty, \text{ i.e. as } x \to \infty. \qquad (6.5.4)$$

6.11 The same term is used in similar situations, for instance, the one
mentioned in footnote 6.10 and those examined in Exercises 6.5.3 and
6.5.4.

Figure 6.5.2


Thus, any exponential function $e^{kx}$, where k > 0 (or $a^x$, where
a > 1) grows faster, as x → ∞, than any power function $x^n$ (or $c x^n$,
where both n and c are positive); although for small values of x the
graph, say, of $y = x^{1000}$ slopes upward much faster than the graph of
the function $y_1 = e^{0.001x}$ (Figure 6.5.2), with the growth of x the
graph of function $y_1$ will finally intersect the graph of function y,
since $y_1$ will overcome y.
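
It is instructive to locate the point where $e^{0.001x}$ finally overtakes $x^{1000}$. The sketch below is an illustration only. Since $x^{1000}$ overflows ordinary floating-point numbers, it compares logarithms, $0.001x$ against $1000 \ln x$, and finds the crossing by bisection. The function names and the bracketing interval are assumptions made for this example.

```python
import math


def log_gap(x):
    """ln(e**(0.001*x)) - ln(x**1000): positive once the exponential is ahead."""
    return 0.001 * x - 1000.0 * math.log(x)


def crossing(lo=1e6, hi=1e9, iterations=200):
    """Bisection for the root of log_gap on [lo, hi].

    log_gap has its minimum at x = 1e6 and increases afterwards, so the
    default interval contains exactly one sign change.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(f"e^(0.001 x) exceeds x^1000 for x above about {crossing():.4e}")
```

The crossing lies near $x \approx 1.66 \times 10^7$: the exponential does win, but only at very large x.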

As x → ∞, the exponential function with a negative exponent decreases
faster than any negative-power function $x^{-n}$. This assertion, for
arbitrary n > 0, is written

$$f = \frac{e^{-x}}{x^{-n}} = x^n e^{-x} \to 0 \quad \text{as } x \to \infty. \qquad (6.5.5)$$

It is risky to use the series expansion of $e^{-x}$ for large x to prove
this because the expansion is an alternating expansion, so that the
various terms compensate each other to a certain extent. We therefore
consider the reciprocal:

$$1/f = x^{-n}/e^{-x} = e^x/x^n.$$

According to (6.5.4), for arbitrary n the quantity $f^{-1} = e^x/x^n$
tends to ∞ as x → ∞, which simply means that f → 0 as x → ∞.

To summarize: in the limit, for large absolute values of the independent
variable in the exponent, the exponential function $e^x$ depends more
strongly on x than any constant power of x, that is, for any integer n,
the function $e^x$ increases faster than $x^n$ and $e^{-x}$ decreases
faster than $x^{-n}$. This is vividly demonstrated in the table for $x^5$
and $e^x$:

x                            1      3     5      10       20       50        100
x^5                          1      243   3125   10^5     3×10^6   3×10^8    10^10
e^x                          2.72   20    150    2×10^4   4×10^8   5×10^21   10^43
x^5/e^x = e^{-x}/x^{-5}      0.37   12    21     5        0.01     10^-13    10^-33

What we have said about the exponential function can be applied to the
logarithmic function $y = \log_a x$, namely, the logarithmic function
$\log_a x$ (with an arbitrary base a > 1) increases, as x → ∞, slower than
any (however small) power $x^n$ of the independent variable, while the
function $e^{-1/x}$ for x > 0 and x → 0 decreases faster than any positive
power of x:

$$\frac{\log_a x}{x^n} \to 0 \quad \text{as } x \to \infty, \qquad (6.5.6)$$

$$\frac{e^{-1/x}}{x^n} \to 0 \quad \text{as } x \to 0 \text{ and } x > 0. \qquad (6.5.7)$$

To prove (6.5.6) it is sufficient to introduce the substitution
$\log_a x = u$, $x = a^u$, $x^n = a^{nu}$, which transforms the ratio of
interest to us into $u/a^{nu} = 1 / (a^{nu}/u)$, while to prove (6.5.7) it
is sufficient to introduce the variable v = 1/x, which transforms the
ratio into $e^{-v}/v^{-n}$ (compare with (6.5.5) and also see
Exercise 6.5.5).

If we turn to the function $w = e^{-1/x^2}$, which is naturally assumed
equal to zero at x = 0 (since for x small in absolute value, that is, for
large values of $1/x^2$, $w = e^{-1/x^2}$ can be made as small as
desired), we can state that the function decreases, as x → 0, faster than
any power of x, that is, $w/x^n \to 0$ for all n > 0. This implies, in
turn, that it is impossible to expand this (seemingly well-behaved)
function in a Maclaurin's series (see the graph of this function in
Figure 15.1.1). Indeed, if the formula (6.1.19) were valid for this
function, then a number of first coefficients on the right-hand side of
this formula could vanish and the series expansion would be
$w(x) = a_k x^k + a_{k+1} x^{k+1} + \ldots$, with $a_k \neq 0$, and the
function would have, for small values of x, a finite order of decrease (k)
with respect to x instead of an "infinite" order of decrease (cf.
Section 15.1).


Exercises 

6.5.1. Find 
(a) lim 

x -*0 


(c) 


lim 

x-*-0 


In ( l+a) 
x 

tan x —x 


P x 

(e) lim — : 

„0 sin a; 


x 2 

-1 


the 
, (b) 

, (d) 


following limits: 
lim jP, (! + «>-* , 

x-*- 0 

e*- 


(f) 


6.5.2. Prove 
L’Hospital’s rule. 

6.5.3. Prove that 
. . . = f n ~ l (x) = 0 at 
s(0) = g'(0) = g"(0)=. 

lim 1{x) = 

*-►0 ?(*) g m ( x ) 


lim 

x-v 0 

the 


lim 

x-*0 
sin x- 


x * 

-1 — tan x 
x 3 


-x 


x — tan x ’ 
variant (6.5.2a) 


of 


if f(x) = f'(x) = f'(x) = 
a; = 0, and, similarly, 
.. = g( n - 1) (0) = 0, then 

(where, of course, we 


assume that the last ratio, which can vanish 
or become infinite, exists). 

6.5.4. Prove that (a) if f(x) → 0 as x → ∞ and g(x) → 0 as x → ∞, the
ratios f(x)/g(x) and f'(x)/g'(x) tend to the same limit as x → ∞, and
(b) if f(x) → ∞ as x → a (where, possibly, a = ∞) and g(x) → ∞ as x → a,
then $\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$.

6.5.5. Prove that the function $e^{-1/x}$ grows in absolute value, for
x < 0 and x → 0, faster than any (integral) negative power of x.

6.6 First-Order Differential Equations. 
The Case of Variables Separable 

Above we encountered (e.g. see Section 3.6) the concept of a differential
equation, that is, an equation that connects the (unknown) function
y = y(x) and its derivatives:

$$F(x, y, y', y'', \ldots) = 0, \qquad (6.6.1)$$

where F is a known function depending on several variables. The laws of
science and the operation of various devices or mechanisms can often be
described using the language of differential equations, so that in many
cases a substantial part of studying the phenomenon that interests us
consists in analyzing and solving an appropriate equation. In Chapter 9 we
will illustrate all that has just been said with examples taken from
mechanics, a science whose basis is formed by Newton's second law, the
differential equation (9.4.2).

The order of a differential equation is the order of the highest
derivative which appears. Here we limit our discussion to first-order
differential equations:

$$F(x, y, y') = 0, \qquad (6.6.2)$$

or, if we solve this equation for the derivative y',

$$y' = f(x, y). \qquad (6.6.3)$$

The simplest case of an equation of type (6.6.3) is when f is independent
of y, that is, when the equation has the form

$$y' = f(x). \qquad (6.6.4)$$

This equation implies that y is the antiderivative of function f(x), or
the indefinite integral of f(x):

$$y = \int f(x)\, dx. \qquad (6.6.5)$$

Chapters 3 and 5 were devoted to the solution of Eq. (6.6.4). Of course,
this equation is the simplest, and its solution follows from the very
definition of the indefinite integral. Still, speaking of (general)
differential equations, it is useful to bear in mind this first example,
which sheds light on the general situation. For instance, already in this
example we see the ambiguity in the statement of the problem, namely, the
presence of an infinitude of solutions to a given equation. In the case of
Eq. (6.6.4) these solutions have the form

$$y = G(x) + C, \qquad (6.6.5a)$$

where G'(x) = f(x). The graphs of all the functional relations y = y(x)
described by (6.6.5a) are obtained from a single graph by parallel
translation along the y axis (see Figure 6.6.1). To select a unique
solution to the problem, we must specify a value $y = y_0$ for a given
initial value $x = x_0$ of the independent variable, the so-called initial
condition for differential equation (6.6.4); this initial condition
determines the unique solution

$$y = y_0 + \int_{x_0}^{x} f(x)\, dx \qquad (6.6.5b)$$

to Eq. (6.6.4), that is, it determines the unique integral curve y = y(x)
passing through the given point $M_0 = M_0(x_0, y_0)$.

Many textbooks are devoted to the theory of differential equations, and we
do not intend to repeat their content here. We will limit our discussion
to equations of type (6.6.3) that can be solved as simply as Eq. (6.6.4)
for the antiderivative (or in the same complicated way as Eq. (6.6.4),
since finding the antiderivative may constitute a complex problem).
Precisely, if the function f(x, y) is the product g(x) h(y) of a function
of x and a function of y, then the variables x and y in the equation

$$\frac{dy}{dx} = g(x)\, h(y) \qquad (6.6.6)$$

can be separated by carrying all terms dependent on x over to one side of
the equation and the terms dependent on y over to the other. Such
equations are called differential equations with variables separable.

Let us divide both sides of Eq. (6.6.6) by h(y):

$$\frac{1}{h(y)} \frac{dy}{dx} = g(x), \qquad (6.6.6a)$$

and then multiply both sides of (6.6.6a) by dx:

$$\frac{dy}{h(y)} = g(x)\, dx. \qquad (6.6.6b)$$

Now, to find the solution to Eq. (6.6.6) we need only integrate the left-
and right-hand sides of Eq. (6.6.6b):

$$\int \frac{dy}{h(y)} = \int g(x)\, dx. \qquad (6.6.7)$$

If these integrals can be computed, then we have the function y = y(x) in
implicit form:

$$H(y) = G(x) + C, \qquad (6.6.8)$$

where $H(y) = \int \frac{dy}{h(y)}$ and $G(x) = \int g(x)\, dx$ are
definite forms of these integrals, and C is an arbitrary constant.

The simplest example of an equation with variables separable is
Eq. (6.6.4), which is obtained if we put g = f and h = 1 in Eq. (6.6.6).
Substituting this into (6.6.7), we arrive at the solution (6.6.5) to
Eq. (6.6.4). An equally simple case is the one where f(x, y) = f(y) in
(6.6.3) depends only on y, that is, where (6.6.3) becomes

$$\frac{dy}{dx} = f(y); \qquad (6.6.4a)$$

here, as we see, g = 1 and h = f. For one, the equation

$$y' = ky, \qquad (6.6.9)$$

which we will often encounter below (e.g. see Chapter 8), also belongs to
this type of differential equations with variables separable. The solution
(6.6.7) to Eq. (6.6.9) assumes the form $\int dy/y = k \int dx$, or
ln y = kx + C, or, finally,

$$y = e^{kx + C} = A e^{kx}, \qquad (6.6.10)$$

where we have put $A = e^C$ (since C is an arbitrary number, we conclude
that A is an arbitrary (positive) number). Thus, we see once more that the
exponential function $y = A e^{kx}$ is characterized by the
proportionality (expressed by (6.6.9)) between the derivative of the
function and the function proper.

Figure 6.6.1

Let us examine some simple examples of differential equations with
variables separable.

1. Suppose that

$$y' = xy, \quad \text{or} \quad \frac{dy}{dx} = xy. \qquad (6.6.11)$$

Separation of variables yields $\frac{dy}{y} = x\, dx$, whence
$\int y^{-1}\, dy = \int x\, dx$, or $\ln y = x^2/2 + C$, or, finally,

$$y = \exp\left( \frac{x^2}{2} + C \right) = A \exp\left( \frac{x^2}{2} \right), \qquad (6.6.12)$$

where, as before, $A = e^C$. The graphs of the function (6.6.12) are
depicted in Figure 6.6.2a.

2. Suppose that

$$y' = \frac{y}{x}, \quad \text{or} \quad \frac{dy}{dx} = \frac{y}{x}. \qquad (6.6.13)$$

Then $\int y^{-1}\, dy = \int x^{-1}\, dx$, or ln |y| = ln |x| + C, or,
finally,

$$y = Ax \qquad (6.6.14)$$

(Figure 6.6.2b; here again $|A| = e^C$).

3. If

$$y' = \frac{x}{y}, \quad \text{or} \quad \frac{dy}{dx} = \frac{x}{y}, \qquad (6.6.15)$$

then we obtain $\int y\, dy = \int x\, dx$, or $y^2/2 = x^2/2 + C$, or,
finally,

$$y^2 - x^2 = a, \qquad (6.6.16)$$

where we have denoted 2C by a (Figure 6.6.2c).

Figure 6.6.2
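
The separation-of-variables recipe (6.6.7) can also be carried out numerically when the integrals are not evaluated by hand. The sketch below is an illustration only: it computes both integrals by Simpson's rule and solves $\int_{y_0}^{y} ds/h(s) = \int_{x_0}^{x} g(t)\, dt$ for y by bisection. It assumes h > 0 on the search interval. The function names and the bracketing values are hypothetical choices made for this example.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def solve_separable(g, h, x0, y0, x, y_lo, y_hi):
    """Solve y' = g(x) h(y), y(x0) = y0, at the point x.

    Uses (6.6.7) in the form H(y) - H(y0) = G(x) - G(x0) and finds y by
    bisection; h is assumed positive on [y_lo, y_hi], so the left-hand
    side increases with y.
    """
    target = simpson(g, x0, x)
    for _ in range(60):
        mid = 0.5 * (y_lo + y_hi)
        if simpson(lambda s: 1.0 / h(s), y0, mid) < target:
            y_lo = mid
        else:
            y_hi = mid
    return 0.5 * (y_lo + y_hi)


if __name__ == "__main__":
    # Example 1: y' = x*y with y(0) = 1; the exact solution is exp(x**2/2).
    y1 = solve_separable(lambda x: x, lambda y: y, 0.0, 1.0, 1.0, 0.5, 5.0)
    print(y1, math.exp(0.5))
```

For Eq. (6.6.11) with y(0) = 1 the computed value at x = 1 agrees with formula (6.6.12) for A = 1.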

Of course, all these examples were specially selected, since here not only
the given equation (6.6.3) is that of type (6.6.6) but the integrals in
solution (6.6.7), too, can easily be calculated, so that the general
solutions (6.6.12), (6.6.14), and (6.6.16) are easily found. But in real
life the situation is not all that simple. In the general case, the
right-hand side of Eq. (6.6.3), of course, need not separate into two
factors each depending on x or on y; besides, if the variables do
separate, this does not mean that the integrals in (6.6.7) can be
expressed in terms of elementary functions. The only path here is to
resort to the various methods of approximate (numerical) solution of
differential equations.

Figure 6.6.3

Note that Eq. (6.6.3) specifies the dependence of the slope y' of the
solution curve y = y(x) for this equation on the position of point (x, y)
in the plane; in other words, Eq. (6.6.3) specifies a field of directions
in the xy-plane, that is, it specifies at each point M = M(x, y) a
direction with a given slope k = y' = f(x, y) (Figure 6.6.3). This fact
suggests a natural method for an approximate solution of Eq. (6.6.3).

Suppose that we wish to find the curve y = y(x) that passes through a
given point $M_0(x_0, y_0)$ in the xy-plane and such that the function
y(x) satisfies condition (6.6.3). Let us move from point $M_0$ along the
direction specified at this point to an adjacent point $M_1(x_1, y_1)$,
where $x_1 = x_0 + \Delta x_0$, $y_1 = y_0 + \Delta y_0$, and
$\Delta y_0 / \Delta x_0 = k_0 = f(x_0, y_0)$. Then from point $M_1$ we
move along the direction specified at this point to another adjacent point
$M_2(x_2, y_2)$, where $x_2 = x_1 + \Delta x_1$,
$y_2 = y_1 + \Delta y_1$, and
$\Delta y_1 / \Delta x_1 = k_1 = f(x_1, y_1)$, and so on. Repeating this
procedure many times, we obtain a broken curve (known as the Euler broken
line; see Figure 6.6.4), which (for small "steps" $\Delta x_0$,
$\Delta x_1$, etc.) yields a rough picture of the behavior of the integral
curve (i.e. solution) y = y(x) to Eq. (6.6.3) (in a certain sense the
Euler broken lines "converge" to the integral curve when the lengths of
all of its "links" tend to zero). Computer calculations carried out via
this method prove to be simple, and the method on the whole is convincing
and reliable.
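
The construction of the Euler broken line translates almost word for word into a short program. The sketch below is an illustration only: it uses equal steps Δx (the text allows varying steps), and the function name `euler_broken_line` is hypothetical.

```python
import math


def euler_broken_line(f, x0, y0, x_end, steps):
    """Vertices of the Euler broken line for y' = f(x, y), y(x0) = y0.

    Each vertex is obtained from the previous one by moving a distance dx
    along the direction with slope k = f(x, y) given at that vertex.
    """
    dx = (x_end - x0) / steps
    xs, ys = [x0], [y0]
    for i in range(steps):
        k = f(xs[-1], ys[-1])          # slope prescribed at the current point
        ys.append(ys[-1] + k * dx)     # delta_y = k * delta_x
        xs.append(x0 + (i + 1) * dx)
    return xs, ys


if __name__ == "__main__":
    # Test equation y' = x*y, y(0) = 1, whose exact solution is exp(x**2/2).
    exact = math.exp(0.5)
    for steps in (10, 100, 1000):
        _, ys = euler_broken_line(lambda x, y: x * y, 0.0, 1.0, 1.0, steps)
        print(steps, ys[-1], abs(ys[-1] - exact))
```

For this test equation the error at x = 1 shrinks roughly in proportion to the step, which illustrates the sense in which the broken lines approach the integral curve.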

Another method of solving differential equations is based on the use of
power series (here, just as in the case with the Euler broken lines, we
are speaking, of course, not only of equations with variables separable).
Suppose that we have an equation of type (6.6.3) in which the dependence
of f(x, y) on x and y is quite simple (the meaning of this rather hazy
statement will be elaborated on below) and wish to find the solution
y = y(x) to this equation that passes through a fixed point
$M_0(x_0, y_0)$ in the xy-plane. We assume that the function y(x) in the
vicinity of point $x = x_0$, at which $y(x_0) = y_0$, is expandable in a
Taylor's series

$$y(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \ldots \qquad (6.6.17)$$

Then 6.12

$$y'(x) = a_1 + 2 a_2 (x - x_0) + 3 a_3 (x - x_0)^2 + \ldots \qquad (6.6.18)$$

Substituting (6.6.17) and (6.6.18) into the initial equation (6.6.3), we
in many cases arrive at the condition of equality of two power series: the
series (6.6.18) for y' and the series expressing the function f(x, y),
where y is given by the series (6.6.17) (it is at this point that it is
desirable for the function f to be "simple", that is, such that
$f(x, a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + \ldots)$ can be represented
as a power series $b_0 + b_1 (x - x_0) + b_2 (x - x_0)^2 + \ldots$).
Equating the coefficients of like powers of $x - x_0$ of the two series,
we can then often use the system $a_1 = b_0$, $2 a_2 = b_1$,
$3 a_3 = b_2$, etc. to find numerically the coefficients $a_1$, $a_2$,
... of the series (6.6.17), that is, find the solution to Eq. (6.6.3)
(since the coefficient $a_0 = y_0$ is known beforehand). 6.13

6.12 The reader will have to take it for granted that a power series, like
the one on the right-hand side of (6.6.17), can be differentiated term by
term. We have not proved that this can be done, but the supposition seems
quite natural.

6.13 Note that expansion (6.6.17), as we know, is only valid in a certain
neighbourhood of point $x = x_0$, whereby it often happens that, upon
reaching a certain point $x = x_1$, we are forced to discard expansion
(6.6.17) and look for the solution in the form of a series expansion in
powers of $x - x_1$, and so on.
The examples we consider below are selected so that this difficulty does
not arise.

To illustrate this method, let us take two simple examples, the first
being of a purely illustrative value since the respective equation can be
solved without resorting to series, the second being typical.

1. Let us return to the equation considered above, (6.6.11), or y' = xy,
and write

$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots. \qquad (6.6.19)$$

Then

$$y' = a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + \ldots. \qquad (6.6.20)$$

Substituting these two series expansions into (6.6.11), we get

$$a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + 5 a_5 x^4 + 6 a_6 x^5 + \ldots = a_0 x + a_1 x^2 + a_2 x^3 + a_3 x^4 + a_4 x^5 + a_5 x^6 + \ldots. \qquad (6.6.21)$$

Equating the coefficients of like powers of x, we get

$$a_1 = 0, \quad a_2 = \frac{a_0}{2}, \quad a_3 = \frac{a_1}{3} = 0, \quad a_4 = \frac{a_2}{4} = \frac{a_0}{2 \cdot 4}, \quad a_5 = \frac{a_3}{5} = 0, \quad \ldots,$$

where $a_0 = y_0 = y(0)$. In view of these equalities we can write
expansion (6.6.19) in the following form:

$$y = a_0 \left[ 1 + \frac{x^2}{2} + \frac{1}{2!} \left( \frac{x^2}{2} \right)^2 + \frac{1}{3!} \left( \frac{x^2}{2} \right)^3 + \ldots \right],$$

from which it follows that $y = a_0 e^{x^2/2}$ (substitute $x^2/2$ for x
in the right-hand side of (6.2.2)).

2. Suppose that

$$y' = xy + 1. \qquad (6.6.22)$$

This is not a differential equation with variables separable, and it can
be proved that its solution cannot be written in the form of an explicit
dependence of y on x. Nevertheless, it is easy to express the solution to
Eq. (6.6.22) in the form of a power series.

Let us again employ expansions (6.6.19) and (6.6.20) and substitute them
into (6.6.22). The result is

$$a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + 5 a_5 x^4 + 6 a_6 x^5 + \ldots = 1 + a_0 x + a_1 x^2 + a_2 x^3 + a_3 x^4 + a_4 x^5 + \ldots,$$

which implies that

$$a_1 = 1, \quad a_2 = \frac{a_0}{2}, \quad a_3 = \frac{a_1}{3} = \frac{1}{1 \cdot 3}, \quad a_4 = \frac{a_2}{4} = \frac{a_0}{2 \cdot 4}, \quad a_5 = \frac{a_3}{5} = \frac{1}{1 \cdot 3 \cdot 5}, \quad a_6 = \frac{a_4}{6} = \frac{a_0}{2 \cdot 4 \cdot 6}, \quad \ldots$$

This enables finding the general solution to Eq. (6.6.22):

$$y = \left( \frac{x}{1} + \frac{x^3}{1 \cdot 3} + \frac{x^5}{1 \cdot 3 \cdot 5} + \frac{x^7}{1 \cdot 3 \cdot 5 \cdot 7} + \ldots \right) + a_0 \left( 1 + \frac{x^2}{2} + \frac{x^4}{2 \cdot 4} + \frac{x^6}{2 \cdot 4 \cdot 6} + \ldots \right), \qquad (6.6.23)$$

where the value of $a_0$ can be determined from the initial conditions;
for instance, if we know that y(0) = 0, then $a_0 = 0$ (see
Exercise 6.6.4).


Exercises

6.6.1. Solve the following differential equations (with variables
separable): (a) $y' = y^2$, (b) y' = k(x/y), (c) y' = k(y/x),
(d) y' = ®V», (e) y' = cos x sin y.

6.6.2. Prove that the straight lines (a) y = 0 and (b) y = ±x depicted in
Figure 6.6.2a and Figure 6.6.2b are also integral curves of Eqs. (6.6.11)
and (6.6.15), respectively.

6.6.3. Point out the "singular" solution to Eq. (6.6.9) similar to the
"singular" solutions of Eqs. (6.6.11) and (6.6.15) considered in
Exercise 6.6.2.

6.6.4. Prove that the integral curve (6.6.23) of Eq. (6.6.22) that passes
through the origin can be written via the following formula:
$y = e^{x^2/2}\, \Phi(x)$, where $\Phi(x) = \int_0^x e^{-t^2/2}\, dt$.

6.6.5. Solve by the method of series expansion (a) Eq. (6.6.9),
(b) Eq. (6.6.13), and (c) equation (a) in Exercise 6.6.1.


6.7* The Differential Equation for 
Water Flow from a Vessel 

As one more example of a problem that requires solving a differential
equation, we now consider the flow of water from a vessel (see
Section 3.6). We will assume that the vessel has an opening near the
bottom (see Figure 3.6.3) or that a small pipe is connected to the bottom,
in view of which the water can flow out of the vessel; on the other hand,
we will assume that the vessel can also receive an inflow of water from
some outside source. The statement of this problem is very simple and
pictorial. At the same time, the mathematical methods required to describe
the flow of water are also employed in more complicated and interesting
problems.

So let us imagine a vessel with water flowing in or out. We denote the
volume of water in the vessel by V. This volume varies with time, which
means that V is a function of time. What meaning has the quantity dV/dt?

It is quite clear that dV = V(t + dt) − V(t) is the amount of water that
has entered the vessel during time dt (if dV is negative, then it is the
amount of water that has left the vessel during time dt). Therefore,
dV/dt = q(t) is the rate of change of the amount of water in the vessel.
The quantity q(t) has a special name, flow rate. If q > 0, then water is
entering the vessel; if q < 0, water is being discharged from the vessel
and the amount (or mass) of water in the vessel diminishes. In the
particular case where the flow rate is constant, that is, q does not
depend on time, we have V(t + 1) = V(t) + q. Here q is the amount of water
entering the vessel in unit time (the quantity may be negative as well).
In general, in the course of a unit of time (which can be a second, an
hour, or even a year) the value of q may change, whereby we define q as a
derivative. The reader that has tackled the relationship between the
average velocity and the instantaneous velocity, the ratio of increments
and the derivative (see Chapter 2) needs no further explanations.

We know that

$$\frac{dV}{dt} = q(t). \qquad (6.7.1)$$

If we know the dependence of the flow rate on time, or the function
q = q(t), the differential equation (6.7.1) in all respects is similar to
Eq. (6.6.4), and the problem of finding V does not differ mathematically
from the velocity-distance problem, which, as we know (see Chapter 3), is
solved by computing a (definite) integral.

For this problem to have a definite solution, we must know the amount
$V_0$ of water in the vessel at a definite initial time $t_0$. The
condition that $V = V_0$ at $t = t_0$ is called the initial condition,
which specifies a unique solution to Eq. (6.7.1) (cf. Section 6.6).

The amount (volume) of water received by, or flowing out of, the vessel
during time $t_0$ to $t_1$ is $\int_{t_0}^{t_1} q(t)\, dt$. Whence the
amount of water in the vessel at time $t_1$ is

$$V(t_1) = V_0 + \int_{t_0}^{t_1} q(t)\, dt. \qquad (6.7.2)$$

This expression holds true for any time $t_1$ and, consequently, fully
determines the desired dependence of V on $t_1$. At $t_1 = t_0$, the
integral in (6.7.2) is zero and $V(t_0) = V_0$. Thus, solution (6.7.2)
does indeed satisfy the stated condition relative to the amount of water
at time $t_0$ (the initial condition).

Formula (6.7.2) can also be used for $t_1 < t_0$, but its meaning is
different from that when $t_1 > t_0$. For $t_1 > t_0$ the quantity
$V(t_1)$ is the amount of water that will be in the vessel at time $t_1$
if at time $t_0$ the amount of water was $V_0$ and the flow rate is given
by the function q = q(t). For $t_1 < t_0$ the quantity $V(t_1)$ is the
amount of water that must be in the vessel at time $t_1$ so that at a
later time, by time $t_0$, there should be an amount of water equal to
$V_0$, with the flow rate fixed by the function q = q(t).

Instead of writing $t_1$ in (6.7.2) we can simply write $t$. Such
notation is not rigorous, because $t$ is also the integration variable,
but it causes no real confusion. Formula (6.7.2) takes the form

$$V(t) = V_0 + \int_{t_0}^{t} q(t)\,dt. \tag{6.7.3}$$

Remember only that $q(t)$ here is not the value of $q$ at the upper limit
of integration but the (variable) function of the variable of
integration, which runs through all values from $t_0$ to $t$.
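Formula (6.7.3) can be tried out numerically. The following is only a minimal sketch: the function name `volume_at`, the trapezoidal rule, and the sample flow rate `q_example` are assumptions chosen for illustration and do not come from the text. It also shows that the same formula works for times earlier than the initial time.

```python
def volume_at(q, t0, v0, t, steps=1000):
    """Approximate V(t) = V0 + integral of q from t0 to t (trapezoidal rule)."""
    h = (t - t0) / steps          # negative when t < t0, as formula (6.7.2) allows
    total = 0.5 * (q(t0) + q(t))
    for i in range(1, steps):
        total += q(t0 + i * h)
    return v0 + h * total


# Hypothetical flow rate: an influx that decreases linearly and then
# turns into a discharge (negative q).
def q_example(t):
    return 2.0 - 0.5 * t


later = volume_at(q_example, t0=0.0, v0=10.0, t=4.0)     # volume reached at t = 4
earlier = volume_at(q_example, t0=0.0, v0=10.0, t=-2.0)  # volume needed at t = -2
print(later, earlier)
```

For this linear flow rate the integral can be done by hand, giving 14 for the later time and 5 for the earlier one, and the numerical values should agree with these up to rounding.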

Formula (6.7.3), which yields the solution to the water-discharge problem
if the flow rate $q(t)$ is given and so is the amount $V_0 = V(t_0)$ at
the initial time $t = t_0$, can also be obtained by somewhat different
reasoning. From (6.7.1), by virtue of the definition of the indefinite
integral, it follows that

$$V(t) = \int q(t)\,dt.$$

Suppose that the indefinite integral of the function $q(t)$ has been
found in some way. Denote it by $I(t)$. Then

$$\int q(t)\,dt = I(t) + C,$$

where $C$ is the constant of integration. From this,

$$V(t) = I(t) + C. \tag{6.7.4}$$

To determine the constant of integration, let us take advantage of the
initial condition, that is, let us require that at $t = t_0$ we have
$V = V_0$. Substituting $t_0$ for $t$ in (6.7.4), we get
$V_0 = I(t_0) + C$, whence $C = V_0 - I(t_0)$. Substituting this value
of $C$ into (6.7.4) yields

$$V(t) = V_0 + I(t) - I(t_0),$$

which coincides with (6.7.3), since

$$\int_{t_0}^{t} q(t)\,dt = I(t)\Big|_{t_0}^{t} = I(t) - I(t_0).$$

Formula (6.7.4) may be termed the 
general solution of Eq. (6.7.1). By 
choosing one or another value of C, 
we can, from formula (6.7.4), obtain 
a variety of particular solutions that 
correspond to different initial condi- 
tions. 

[Figure 6.7.1]

Ordinarily, however, the flow rate is not known as a function of time.
More often we know a physical law describing the flow rate as a function
of the water head, that is, of the height of water level $z$
(Figure 6.7.1). For example, when water is flowing out of a long thin
pipe, we can assume that

$$q = -kz, \tag{6.7.5}$$

where the coefficient $k$ is a positive constant, the minus sign meaning
that the water is being discharged. When the water flows out through an
opening in a thin wall, the law is quite different:

$$q = -a\sqrt{z} \tag{6.7.5a}$$

(this law was established by a pupil of Galileo Galilei, one Evangelista
Torricelli (1608-1647), an Italian physicist and mathematician). Also
possible is a combination of a constant influx of water from above
($q_0$) and a discharge of water by law (6.7.5) or (6.7.5a):

$$q = q_0 - kz \quad\text{or}\quad q = q_0 - a\sqrt{z}. \tag{6.7.6}$$

In each of these cases, until the problem has been solved, we do not
know the dependence of water level on time, $z = z(t)$, and hence we do
not know the flow rate. For this reason, the problem of determining $V$
from the equation

$$\frac{dV}{dt} = q(z) \tag{6.7.7}$$

cannot be reduced to the preceding problem.

Here we stated the problem in the general case for an arbitrary
dependence of the flow rate on the level $z$, $q = q(z)$. Equation
(6.7.7) involves two unknown quantities: the amount (volume) of water
$V$ and the water level $z$. It is clear that these quantities are not
independent. A definite water level is associated with a very definite
amount of water, so that $V$ is a function of $z$, or $V = V(z)$.

It is clear that the form of this function is determined by the shape
of the vessel. For a cylindrical vessel of radius $r_0$, for example, we
have $V(z) = \pi r_0^2 z$, where $z$ is the water level reckoned from
the base of the vessel. For a conical vessel depicted in Figure 6.7.1,
the formula for the volume of a cone yields $V(z) = (1/3) S(z) z$, where
$S(z) = \pi [r(z)]^2$ is the area of a cross section of the vessel at
level $z$, and $r(z)$ is the radius of the cross section at level $z$.
From the similarity of triangles, we find that $r(z) = r_0 z/h$, where
$r_0$ is the radius of the upper base of the cone and $h$ is the total
height, so that $V(z) = (1/3)(\pi r_0^2/h^2) z^3$.

Substituting the function $V = V(z)$ into Eq. (6.7.7) yields

$$\frac{dV(z)}{dt} = \frac{dV(z)}{dz}\,\frac{dz}{dt} = q(z).$$

The derivative $dV/dz$ of the volume with respect to the height of water
level, $z$, is equal to the cross-sectional area $S(z)$ at height $z$
(see formula (3.6.20)), or $dV(z)/dz = S(z)$. The final equation is

$$S(z)\,\frac{dz}{dt} = q(z). \tag{6.7.8}$$
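The relation $dV/dz = S(z)$ used to obtain (6.7.8) is easy to check numerically for the conical vessel. The sketch below is illustrative only: the function names, the central-difference step, and the sample dimensions are assumptions, not part of the text.

```python
import math


def cone_volume(z, r0, h):
    """V(z) = (1/3) * (pi * r0**2 / h**2) * z**3 for a cone filled to level z."""
    return math.pi * r0**2 * z**3 / (3 * h**2)


def cone_section_area(z, r0, h):
    """S(z) = pi * r(z)**2 with r(z) = r0 * z / h (similar triangles)."""
    return math.pi * (r0 * z / h) ** 2


def central_difference(f, x, dx=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)


# Hypothetical dimensions: upper radius 0.5, height 2.0, water level 1.2.
r0, h, z = 0.5, 2.0, 1.2
dV_dz = central_difference(lambda level: cone_volume(level, r0, h), z)
print(dV_dz, cone_section_area(z, r0, h))  # the two numbers should nearly agree
```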


Clearly, this is a differential equation with variables separable. Let
us consider the solution of this equation for different functions
$S = S(z)$ and $q = q(z)$.^{6.14}

6.14 Of course, other variants of water discharge from a vessel are also
possible (say, a vessel with a shape that does not allow us to write out
the function $S = S(z)$ explicitly) when an exact solution of the
respective equation is impossible and one is forced to resort to
approximate methods (see also the text at the end of Section 6.6 in
small print).


Let us rewrite Eq. (6.7.8) in the form $dz/dt = q(z)/S(z)$ and put
$S(z)/q(z) = f(z)$. Then we have

$$\frac{dz}{dt} = \frac{1}{f(z)}. \tag{6.7.9}$$

We have already dealt with such equations (see Eq. (6.6.4a)). Let us
rewrite Eq. (6.7.9) as

$$\frac{dt}{dz} = f(z). \tag{6.7.10}$$


This notation fits the fact that we 
temporarily regard t as a function of z, 
that is, we seek not the law z = z (t) 
describing the change of water level 
with time but the inverse function 
t = t (z), and after finding it we will 
express z in terms of t. 

Let us multiply the left- and right-hand sides of (6.7.10) by $dz$ and
integrate:

$$\int_{t_0}^{t} dt = \int_{z_0}^{z} f(z)\,dz. \tag{6.7.11}$$

From this it follows that

$$t = t_0 + \int_{z_0}^{z} f(z)\,dz. \tag{6.7.12}$$

We have the solution to the problem: 
on the right is a function of z, on the 
left is the time t , so that t is expressed 
as a function of variable z (the reader 
will recall that t 0 is simply a number). 
Solution (6.7.12) also enables us, for 
each value of t , to find the correspond- 
ing value of z and satisfies the initial 
condition: z = z 0 at t = t 0 (at the 
initial time t = t 0 the level of the 
water in the vessel, z 0 , is given). 
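Solution (6.7.12) can be evaluated numerically even when the integral is not available in closed form. The sketch below assumes Simpson's rule and a hypothetical helper name `time_to_reach`; the cylindrical vessel with the discharge law (6.7.5) and the numbers used are sample choices made only so that the result can be compared with an elementary integral.

```python
import math


def time_to_reach(z, z0, t0, S, q, steps=2000):
    """t = t0 + integral from z0 to z of S(z)/q(z) dz, by Simpson's rule.

    `steps` must be even; S and q are functions of the level z.
    """
    def f(level):
        return S(level) / q(level)

    h = (z - z0) / steps
    total = f(z0) + f(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(z0 + i * h)
    return t0 + total * h / 3


# Hypothetical data: cylinder of radius 0.1 draining by q = -k z with k = 0.01.
r0, k = 0.1, 0.01
t_half = time_to_reach(
    z=0.5, z0=1.0, t0=0.0,
    S=lambda level: math.pi * r0**2,
    q=lambda level: -k * level,
)
# For this case the integral is elementary: (pi r0^2 / k) * ln(z0 / z).
print(t_half, math.pi * r0**2 / k * math.log(1.0 / 0.5))
```

To recover $z$ for a given $t$, one would tabulate $t(z)$ in this way and invert the table, which is the numerical counterpart of expressing $z$ in terms of $t$.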

Here are two concrete examples of the functions $f(z)$.

1. Water flowing out of a cylindrical vessel of radius $r_0$ through a
thin pipe. Here $S(z) = \text{constant} = \pi r_0^2$ and $q = -kz$, so
that Eq. (6.7.8) assumes the form

$$\pi r_0^2\,\frac{dz}{dt} = -kz, \tag{6.7.13}$$

or

$$-\pi r_0^2\,\frac{dz}{z} = k\,dt.$$

Integrating the left-hand side from $z_0$ to $z$ and the right-hand side
from $t_0$ to $t$, we get

$$-\pi r_0^2 (\ln z - \ln z_0) = -\pi r_0^2 \ln\frac{z}{z_0} = k(t - t_0). \tag{6.7.14}$$

In this example it is easy to express $z$ in terms of $t$. Indeed,

$$\ln z = \ln z_0 - \frac{k}{\pi r_0^2}(t - t_0), \quad\text{i.e.}\quad z = z_0 \exp\left[-\frac{k}{\pi r_0^2}(t - t_0)\right].$$


We consider two instants of time, $t$ and $t + \Delta t$, and find the
ratio $z(t + \Delta t)/z(t)$:

$$\frac{z(t + \Delta t)}{z(t)} = \exp\left[-\frac{k}{\pi r_0^2}(t + \Delta t - t_0 - t + t_0)\right] = \exp\left[-\frac{k}{\pi r_0^2}\,\Delta t\right].$$

We see that this ratio depends only on $\Delta t$ and is independent of
$t$. Therefore, the water level $z$ falls in equal ratio during equal
intervals of time; it constantly diminishes exponentially, but will
never reach the value $z = 0$, that is, the water will never flow out of
the vessel completely.^{6.15}

6.15 This paradox (the water will never flow out of the vessel) stems
from the fact that Eq. (6.7.13) gives a relatively accurate picture of
the process only when $z > p$, where $p$ is the radius of the pipe
through which the discharge occurs.

The problem changes little if there is a constant influx of water into
the vessel, or $q(z) = q_0 - kz$. Equation (6.7.13) then assumes the form

$$\pi r_0^2\,\frac{dz}{dt} = q_0 - kz. \tag{6.7.15}$$

Here it proves convenient to start by finding the so-called steady-state
mode, that is, we look for a solution of this equation in the form
$z = z_{st} = \text{constant}$. Since $z = z_{st}$ and $dz/dt = 0$, we get

$$q_0 - kz_{st} = 0, \qquad z_{st} = q_0/k.$$

It is quite clear that if the process starts at the water level $z$
being equal to $z_{st}$, then $z$ will not change: the discharge of
water from the vessel is completely compensated for by the influx of
water, and the water level remains constant. Now suppose that the
initial water level $z_0$ is higher than $z_{st}$, so that the discharge
(proportional to $z$) exceeds the influx and the level $z$ diminishes.
We will look for the difference $z_1 = z - z_{st}$ between $z$ and the
steady-state level $z_{st}$. This difference, obviously, satisfies the
equation obtained by substituting $z = z_1 + z_{st}$ and
$q_0 = kz_{st}$:

$$\pi r_0^2\,\frac{d(z_1 + z_{st})}{dt} = kz_{st} - k(z_1 + z_{st}), \quad\text{i.e.}\quad \pi r_0^2\,\frac{dz_1}{dt} = -kz_1.$$

But this equation differs from (6.7.13) only in notation, so that its
solution has the form

$$z_1 = z_1^{(0)} \exp\left[-\frac{k}{\pi r_0^2}(t - t_0)\right],$$

where $z_1^{(0)} = z_0 - z_{st}$. The final result is

$$z = z_{st} + (z_0 - z_{st}) \exp\left[-\frac{k}{\pi r_0^2}(t - t_0)\right];$$

here $z$ tends to the steady-state level $z_{st}$ of the water but never
reaches it.^{6.16}
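The approach to the steady-state level can be checked by integrating Eq. (6.7.15) step by step and comparing with the closed-form result. This is a rough sketch under stated assumptions: the explicit Euler scheme, the function names, and all numerical values are chosen here for illustration and are not taken from the text.

```python
import math


def level_closed_form(t, t0, z0, q0, k, r0):
    """z = z_st + (z0 - z_st) * exp(-(k / (pi r0^2)) * (t - t0)), z_st = q0 / k."""
    z_st = q0 / k
    rate = k / (math.pi * r0**2)
    return z_st + (z0 - z_st) * math.exp(-rate * (t - t0))


def level_euler(t, t0, z0, q0, k, r0, steps=20000):
    """Explicit Euler integration of pi * r0**2 * dz/dt = q0 - k * z."""
    area = math.pi * r0**2
    dt = (t - t0) / steps
    z = z0
    for _ in range(steps):
        z += dt * (q0 - k * z) / area
    return z


# Hypothetical data: r0 = 0.5, k = 0.2, q0 = 0.1, so that z_st = 0.5.
params = dict(q0=0.1, k=0.2, r0=0.5)
from_above = level_closed_form(10.0, 0.0, 2.0, **params)   # starts above z_st
from_below = level_closed_form(100.0, 0.0, 0.1, **params)  # starts below z_st
numeric = level_euler(10.0, 0.0, 2.0, **params)
print(from_above, numeric, from_below)
```

With these sample values the level starting above $z_{st}$ stays above it and the level starting below stays below it, while both draw closer to $z_{st}$ as time goes on.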

6.16 The last statement remains valid for $z_0 < z_{st}$ (see
Exercise 6.7.3), which indicates that the steady state $z = z_{st}$ of
the water in the vessel is stable (see also Section 9.3).

2. Water flowing out of a conical vessel (altitude $h$ and radius of
base $r_0$; see Figure 6.7.1) through a thin pipe. Here
$S(z) = \pi r_0^2 (z/h)^2$ and $q = -kz$. In this case we have the
equation

$$\frac{dz}{dt} = \frac{q(z)}{S(z)} = -\frac{kzh^2}{\pi r_0^2 z^2} = -\frac{kh^2}{\pi r_0^2}\,\frac{1}{z}, \tag{6.7.16}$$

which can be rewritten thus:

$$-\frac{\pi r_0^2}{kh^2}\,z\,dz = dt.$$

Integration yields

$$-\frac{\pi r_0^2}{kh^2}\int_{z_0}^{z} z\,dz = -\frac{\pi r_0^2}{2kh^2}\,(z^2 - z_0^2) = t - t_0.$$

Here we can easily express $z$ in terms of $t$:

$$z^2 = z_0^2 - \frac{2kh^2}{\pi r_0^2}(t - t_0), \quad\text{or}\quad z = \sqrt{z_0^2 - \frac{2kh^2}{\pi r_0^2}(t - t_0)}. \tag{6.7.17}$$

This formula completely solves the problem. It is readily verified that

$$\frac{dz}{dt} = -\frac{kh^2}{\pi r_0^2}\left[z_0^2 - \frac{2kh^2}{\pi r_0^2}(t - t_0)\right]^{-1/2} = -\frac{kh^2}{\pi r_0^2}\,\frac{1}{z},$$

so that $z$ does indeed satisfy the equation. It is also clear that
$z = z_0$ at $t = t_0$. The expression (6.7.17) permits finding the time
when the vessel is completely emptied:

$$z = 0 \quad\text{when}\quad t = t_0 + \frac{\pi r_0^2 z_0^2}{2kh^2}. \tag{6.7.18}$$

There is an interesting qualitative difference between these two
examples. In the second case, the water completely flows out of the
vessel during a finite time interval $T = (\pi r_0^2/kh^2)\,z_0^2/2$
(see (6.7.18)), while in the first case the water level $z$ approaches
the zero level $z = 0$ asymptotically (it never reaches this level; in
other words, an infinitely long interval of time is needed for the water
to flow out completely).
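The contrast between the two examples can be made concrete with a few numbers. The following sketch simply evaluates formulas (6.7.17) and (6.7.18) for the cone and the exponential law for the cylinder; the function names and the sample values of $k$, $r_0$, $h$, and $z_0$ are assumptions made for illustration.

```python
import math


def cone_level(t, t0, z0, k, r0, h):
    """Formula (6.7.17); the level is taken as zero once the vessel is empty."""
    inside = z0**2 - 2 * k * h**2 / (math.pi * r0**2) * (t - t0)
    return math.sqrt(max(inside, 0.0))


def cone_emptying_time(z0, k, r0, h):
    """T = pi * r0**2 * z0**2 / (2 * k * h**2), from (6.7.18)."""
    return math.pi * r0**2 * z0**2 / (2 * k * h**2)


def cylinder_level(t, t0, z0, k, r0):
    """Exponential law of the first example."""
    return z0 * math.exp(-k / (math.pi * r0**2) * (t - t0))


# Hypothetical data shared by both vessels.
z0, k, r0, h = 1.0, 0.2, 0.5, 2.0
T = cone_emptying_time(z0, k, r0, h)
print(T)
print(cone_level(T / 2, 0.0, z0, k, r0, h), cone_level(T, 0.0, z0, k, r0, h))
print(cylinder_level(T, 0.0, z0, k, r0), cylinder_level(20 * T, 0.0, z0, k, r0))
```

At time $T$ the cone is empty (up to rounding), whereas the cylinder still holds water at $T$ and at any later time, however small the remaining level becomes.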


Exercises 


6.7.1. Consider the solution to Eq. (6.7.15) that satisfies the initial
condition $z_0 < z_{st}$. How will the water flow out of the vessel in
this case?

6.7.2. Consider the case where the water flows out of a conical vessel
through a thin pipe under the condition of a constant influx of water.
[Hint. The differential equation of this process is
$\pi r_0^2 (z/h)^2\,dz/dt = q_0 - kz$.]


* * * 

While in Chapters 2 and 3 we discussed the ideas underlying differential
and integral calculus, Chapters 4 to 6 were devoted to the tools of
higher mathematics, and each person who wants to master mathematical
analysis to such an extent as to be able to employ the theorems and
concepts in practical applications must become familiar with these
tools. Of course, the main problem that confronts a scientist or
engineer is the problem of adequately describing a problem of interest
in mathematical terms (most often in the form of a differential
equation); the analysis of the model that follows is purely mathematical
(solution of a differential equation) and can be entrusted to a
mathematician or passed over to a programmer for processing on a
computer. (However, the formulation of the model cannot be reassigned to
anyone; it must be carried out by a specialist who has a deep knowledge
of the phenomenon in question.) On the basis of this the engineer may
decide there is no need to study the mathematical tools, but he, or she,
will be deeply mistaken. Understanding the essence of higher mathematics
is closely connected with mastering certain mathematical tools: without
these tools there can be no real understanding of the principles. To
take an example from another field, reading special literature in a
foreign language, even with the help of a dictionary (which plays the
role of a programmer, so to say), is impossible without knowing the
basic facts of the grammar of the language (the “theory” of the
language) and without having a certain minimal vocabulary (the basis of
“engineering” skills). Without knowing the grammar of the language we
are helpless, since we do not know what words (and in what form) we must
look up in the dictionary, while not possessing even a minimal
vocabulary means that the dictionary is useless, since the totally
unfamiliar text is a stumbling block. However, even a small vocabulary
of words and expressions and a rudimentary knowledge of the grammar make
possible the full employment of a dictionary. In the same way, a
physicist or engineer often (but not always) needs only a basic
knowledge of higher mathematics and a smattering of technical skills.

Chapters 4 and 5 give an overall picture
(rather sketchy, perforce) of differentiation
(finding derivatives) and integration
techniques. Finding derivatives
is quite simple, and therefore it is 
desirable to develop this skill to a state 
when finding derivatives of simple func- 
tions is carried out automatically, so 
to say. Here there is no need to solve 
as many complicated problems as one 
can put his, or her, hands on, since a 
physicist or engineer seldom encounters 
complicated functions. It is much more 
important to know how to solve simple 
problems at first glance, so we recom- 
mend solving all the exercises on cal- 
culating derivatives collected in this 
book and not calculating derivatives 
of more complicated functions. If solving problems from this book
proves to be insufficient for developing the necessary skills, the
reader can turn to any problem book on differential calculus or even
compile a list of arbitrary functions and try to find their
derivatives. There is no need to try to memorize the entire table of
derivatives (see Appendix 1); it is much more advisable at the first
stages in the reader’s
studies to copy the formulas onto a 
card and use the compiled table when 
solving problems. The formulas that 
are used more often will be memorized, 
and finally there will be no need of the 
card. 

Integration techniques developed in 
Chapter 5 are much more complicated. 
We believe there is no need to devote 
too much time to these techniques. Of 
course, it is useful to know how to 
evaluate simple integrals, but it is 
much more important to know how to 
use tables of integrals. Of course,
skills in integrating the simplest functions
and the knowledge of the basic
properties of integrals (say, a fluent
use of the integration-by-substitution 
method) developed in the course of 
calculating integrals are necessary to 
every student of science and engineer- 
ing, since most often the integral that 
you will encounter will not exactly 
coincide with an integral in a table 
but can almost always be reduced to 
such an integral via simple transform- 
ations, including the change of vari- 
ables. 

Chapter 6 begins with the topic of 
series . The technique by which com- 
plicated functions (for instance, ob- 
tained through experiments) can be 
reduced to simple functions is one of 
the basic contributions of higher math- 
ematics to engineering practice — every 
physicist and engineer must be ac- 
quainted with it. The key to the tech- 
nique lies in the theory of expansion of 
functions into power series (Sections 6.1 
to 6.4); in other cases the expansion of 
functions into trigonometric series, that 
is, the representation of functions as 
linear combinations of trigonometric 
functions (see Section 10.9), constitutes 
a good addition to the technique. 

We have confined our discussion to elements of the corresponding
theory, which is now undergoing rapid growth, for which there
are several reasons; among them are 
the rapid development of the sections 
of pure mathematics that are widely 
used in the theory of function approxi- 
mation (functional analysis, the theory 
of functions of a complex variable, 
and other fields), practical reasons con- 
nected with the constant need to sim- 
plify complicated functions, and the 
invention of the electronic computer, 
which opens up new possibilities in 
simplifying functions and at the same 
time poses requirements on the form 
of the functions used in computer cal- 
culations. 

The second topic discussed in Chapter 6 is differential equations,
which are of particular importance to physicists and engineers. Indeed,
a mathematical analysis of natural phenomena usually
starts with attempts to represent the
laws of nature as differential equations — 
these connect the various (variable) 
quantities that describe the phenome- 
non of interest (such representation is 
usually “abstract”, that is, it gives 
only an approximate picture of the 
real phenomenon). In Section 6.7 we 
studied this general scheme using the 
example of water flowing out of a ves- 
sel. Note that the content of Section 6.7 
and the examples of differential equa- 
tions discussed in Section 6.6 are of an 
illustrative value only; they can be 
freely replaced with any other examples 
and there is no need to memorize 
them (although the concept of a first- 
order differential equation with vari- 
ables separable is so important that 
it is advisable to master it). In this 
respect it is appropriate to note the 
difference between Chapter 4 and the 
last sections of Chapter 6: while the 
reader must carefully read Chapter 4, 
Section 6.6 and especially Section 6.7 
can just be browsed through, so as to 
understand the principal points. 


Higher mathematics was created by New- 
ton and Leibniz in the 17th century, and even 
in their works it was a highly developed dis- 
cipline. Of course, its development did not 
stop there. Newton and Leibniz did not con- 
fine themselves to the basic definitions and 
initial theorems. Their contribution was much 
greater. Newton clearly realized the impor- 
tance of differential equations in analyzing 
natural phenomena. Take his famous “Prin- 
cipia” (1687), which contains Newtonian me- 
chanics (the Newtonian world system), and 
you will see it starts with the differential 
equation of motion (see Eq. (9.4.2)). This equa- 
tion is taken as an axiom, whereas all sub- 
sequent propositions of mechanics are theo- 
rems derived from this axiom (and also from 
the law of gravity that follows from experi- 
mental findings (Kepler’s laws) and axiom 
(9.4.2)). In his mathematical investigations 
Newton formulated “the main problems of 
mathematical analysis” as follows: 

(1) from a given relationship between flu- 
ents (initial functions) to determine the re- 
lation between fluxions (derivatives); 

(2) from a given equation containing fluxions 
to find the relationship between fluents. 

The first of these problems is obviously a problem of the
differentiation of known combinations of functions: thus, in Newton’s
notation, if $z = uv$, then $\dot{z} = \dot{u}v + u\dot{v}$, where $u$,
$v$, and $z$ are fluents and $\dot{u}$, $\dot{v}$, and $\dot{z}$ are
fluxions. The second problem is a problem of the solution of
differential equations.

Newton also devoted much attention to 
infinite series (discovery of the binomial series
made a great impression on him^{6.17}). The
whole of Newton’s theory of fluxions and flu- 
ents (derivatives and integrals) was obviously 
the fruit of his investigations into the theory 
of infinite series. Nor is it accidental that the 
basic theorems (see formulas (6.1.18) and 
(6.1.19)) of the theory of series are linked with 
the names of Newton’s pupils and associates: 
Brook Taylor (1685-1731), who was Secretary 
to the Royal Society of London (the British 
academy of sciences) when Newton was its 
President, and Professor Colin Maclaurin
(1698-1746) of Edinburgh University, who 
knew Newton personally and greatly admired 
him. Incidentally, both formulas, (6.1.18) 
and (6.1.19), were first recorded by Taylor. 
Maclaurin’s contribution was that he posed 
the question of the sphere of applicability of 
the formulas and partially answered it. The 
exact formula (6.1.16), the one containing a 
finite number of terms, was established by 
Lagrange (we will return to him again later 
on). 

Leibniz, too, made important contribu- 
tions to the theory of series and to differential 
equations in the solution of various geometri- 
cal problems. Note that Leibniz was more in- 
terested in geometry than in mechanics: he 
associated the concept of the derivative with 
a tangent and not speed. Significantly, his 
fundamental memoir on the “new mathe- 
matics” had an involved title: “A New Method 
of Maxima and Minima, and Also of Tangents, 
for Which Method Neither Fractions Nor Ir- 
rational Quantities are an Obstacle, and a 
Special Kind of Calculus for This”. 

The excellent school of Leibniz, headed by 
the two brothers, Jakob Bernoulli (1654-1705) 
and Johann Bernoulli (1667-1748), made an 
outstanding contribution to the differential 
and integral calculus. They were the founders 
of the famous Swiss family of mathemati- 
cians, which in the 17th and 18th centuries 
produced eight first-class mathematicians. 
(The most prominent of the later members of 
the family was Johann’s son Daniel Bernoulli 
(1700-1782), who worked in Basel and St. 
Petersburg and was one of the founders of hy- 
drodynamics and the kinetic theory of gases. 
We will have occasion again to discuss his 
work.) 


6.17 For an account of how this outstanding discovery was evidently
made, see, for example, G. Polya, Mathematical Discovery (On
Understanding, Learning and Teaching Problem Solving), Vol. 1, Wiley,
New York, 1962, pp. 91-93.

Johann Bernoulli


Jakob Bernoulli received a theological 
education, but his interest in mathematics 
prevailed. He studied the mathematical lit- 
erature by himself and came upon the works 
of Leibniz, which made a profound impression 
on him. In fact, they so overwhelmed him
that he gave up the secure life of a pastor and 
for a number of years held the little-respected 
and low-paid post of a home tutor. It was only 
thanks to the good offices of Leibniz that in 
1683 he was appointed a professor (at first, 
of physics, and later of mathematics, too) 
at the University of Basel. 

Johann Bernoulli’s father wanted him to 
go into commerce, but thanks to his elder
brother, who had taught him mathematics and 
physics, his clearly expressed scientific in- 
terests ran counter to his father’s wishes. Since 
Switzerland in those years provided very 
little opportunity for the study of mathe- 
matics, Johann Bernoulli was compelled to 
obtain a medical education and for many years 
earned his living as a physician. His doctoral 
dissertation on the movement of muscles was 
interesting in that it was evidently the first 
attempt to apply the methods of higher math- 
ematics to physiology. On the recommenda- 
tion of the famous Christian Huygens, Johann 
was appointed to a professorship in physics at 
Groningen (the Netherlands) in 1695. Return- 
ing to Basel in 1705, he at first could obtain 
a university post only as a professor of Greek. 
It was only after the death of his brother Ja- 
kob that Johann Bernoulli was appointed pro- 
fessor of mathematics at the University of 
Basel and held this post to the end of his life. 

The Bernoulli brothers maintained a live- 
ly correspondence with Leibniz, who time 
and again expressed his admiration for their 
successes. It was precisely in the course of this 
correspondence that mathematical analysis 


received its present form, basic symbolism, 
and terminology. For example, originally 
Leibniz had been inclined to speak of “differ- 
ential calculus” and “summational calculus” 
as the two branches of the “new mathematics,”
but on a suggestion from Johann Bernoulli 
he finally chose the Latinized term “integral 
calculus” (instead of “summational”). Jakob 
and Johann Bernoulli greatly advanced the 
new calculus, obtaining, in particular, im- 
portant results in the theory of differential equa- 
tions (see Section 6.6) and laying the founda- 
tions of what is called calculus of variations , 
which is mentioned in passing in Section 7.2 
(see also the text above Example 5 in Section 7.1).^{6.18}

The first printed textbook of differential 
and integral calculus, entitled “Infinitesimal 
Analysis for the Study of Curves” (note how 
it echoes the famous memoirs of Leibniz) ap- 
peared in 1696. Its author was Guillaume 
F. A. L'Hospital (1661-1704), a French noble- 
man (Marquis de St. Mesme) who was a pupil 
of Leibniz and Johann Bernoulli. This excel- 
lent textbook went through many editions 
and was translated into many languages. The 
ideas L’Hospital set forth closely follow the 
lectures which he heard Johann Bernoulli 
deliver in Paris; they also follow Bernoulli’s 
manuscript manual “Lectures on Differential 
Calculus,” which was discovered in the li- 
brary of the University of Basel in the 20th 
century and was published for the first time 
in 1922. Thus, while Bernoulli’s text remained unknown, the lectures
which were based on it and were attended by only one student,
L’Hospital, exerted a tremendous influence on the entire subsequent
development of higher mathematics (in particular, they spread the
symbols and terminology of Leibniz). Among other things, L’Hospital’s
textbook saw the first publication of the method of calculating limits
which Johann Bernoulli taught his pupil and which is now with so little
justification simply called “L’Hospital’s rule” (see Section 6.5).

6.18 Variational calculus became an independent scientific discipline in
the 18th century thanks to the works of Euler and Lagrange.

Leonhard Euler

Another of Johann Bernoulli’s pupils was 
the Swiss mathematician Leonhard Euler 
(1707-1783). He was introduced to Bernoulli 
by his father, Pastor Paul Euler, who had 
once studied mathematics under Jakob Ber- 
noulli. Pastor Euler wanted his son to be a 
pastor too, but Johann Bernoulli convinced
Paul Euler that the boy had outstanding 
mathematical ability. On Johann Bernoulli’s 
recommendation Leonhard Euler went to 
St. Petersburg in 1727, where an Academy of 
Sciences had recently been founded and where 
two of his teacher’s sons (one of them was Da- 
niel Bernoulli) were working. It was orig- 
inally planned that Euler would take the 
vacant post of Professor of Physiology at the 
St. Petersburg Academy, and he made a thor- 
ough study of this science in order to follow 
his teacher’s example and apply mathematical 
methods in it. But when he arrived in St. 
Petersburg, he found that the post of professor 
of mathematics was also vacant. So he gave 
up physiology and devoted himself to mathe- 
matics, physics, mechanics, and astronomy 
(for one, the movements of the moon). 

Leonhard Euler spent a large part of his life in St. Petersburg. There
was a break of 25 years when, on invitation from King Frederick II of
Prussia, he moved from St. Petersburg, in those years a place not very
conducive to calm scientific research, to Berlin.^{6.19} Here he became
head of the physics and mathematics department of the Berlin (Prussian)
Academy of Sciences, but he kept up his close contacts with the
St. Petersburg Academy. Euler was the most prominent and most productive
mathematician of the 18th century. His work dealt with literally every
sphere of mathematics and mathematical physics. We write about some of
Euler’s findings below (see Section 10.8 and Chapters 14 and 15).
Euler’s multivolume textbook of differential and integral calculus was
translated into other languages.

Jean Le Rond d’Alembert

Among Euler’s contemporaries was Jean Le Rond d’Alembert (1717-1783),
a distinguished French mathematician and author of
treatises on mechanics. His exceptional scien- 
tific versatility is particularly evident in his 
collaboration with Denis Diderot (1713-1784). 
They produced the famous “Encyclopédie, ou
Dictionnaire raisonné des sciences, des arts
et des métiers,” in which d’Alembert wrote nearly all
the articles dealing with the natural sciences. 
D’Alembert was named after the church (Jean 
Le Rond; literally, Jean the Round) on the 
steps of which he was found as an abandoned 
infant. Notwithstanding this handicap he be- 
came a prominent man of science. In Section 
10.8 we describe a most edifying and fruitful 
scientific debate in which Leonhard Euler, 
d’Alembert, and Daniel Bernoulli took part. 
Both Euler and d’Alembert strove for concrete results, displaying
outstanding intuition in mathematics and physics and a lack of interest
in scholastic discussions of mathematical ideas. This is illustrated,
for example, by d’Alembert’s well-known call to the young: “Keep on
working; complete understanding will come in time.”

6.19 This refers to the “Biron period” in Russian history, named after
E. J. Biron, a disreputable favorite of Empress Anne (1693-1740), whom
she brought from Courland. The reign of Empress Anne, from 1730 to 1740,
was marked by arbitrary arrests and executions of innocent people and
total disorganization of the machinery of government.

When Euler decided to return to St. 
Petersburg after having headed the physico- 
mathematical department of the Berlin Acad- 
emy from 1741 to 1766, the question of his 
successor arose. Euler recommended the young 
mathematician Joseph Louis Lagrange (1736- 
1813), who was born into a French family that 
had moved to Italy. Lagrange was only 30 
at the time. His candidature was enthusiasti- 
cally supported by d’Alembert, who corre- 
sponded with King Frederick II. Lagrange was 
already a well-known scientist. He had begun 
to teach in the Turin Artillery School in 1756, when he was 20, and the
following year he was one of the founding members of the Turin Academy
of Sciences, in whose proceedings he published many papers. In 1787,
after the death of Frederick II, Lagrange moved to France, where he
played an outstanding role in the rise of the Parisian Polytechnical
School (École Polytechnique), an institution of higher learning of a new
type, which trained research engineers.^{6.20} It was during the
Parisian period that Lagrange compiled his two-volume “Analytical
Mechanics” (1788), which greatly furthered mathematics and physics.
Lagrange published an excellent, and in many respects revolutionary,
textbook of mathematical analysis (“The Theory of Analytic Functions,”
1797) based on lectures that he delivered at the Polytechnical School.
He is also known for his fundamental studies in algebra. In the breadth
and range of his mathematical interests and in the brilliance of his
achievements, Lagrange was second perhaps only to Euler among the
mathematicians of the 18th century.

6.20 The example of the famous Polytechnical School of Gaspard Monge
(1746-1818) and Lagrange was undoubtedly taken into account by the
founders of the Moscow Physico-Technical Institute in the Soviet Union
and the founders of the Massachusetts and California institutes of
technology in the United States.

Augustin Louis Cauchy

One of the pupils and ardent admirers of 
Lagrange was Jean Baptiste Fourier (1768- 
1830), a professor at the Polytechnical School 
whose name is linked with outstanding achievements
in the field of partial differential equations
(also called equations of mathematical
physics; see Section 10.8), in particular in the theory
of the propagation of heat and in the theory of
trigonometric series (see Section 10.9).

Another professor of the Polytechnical 
School was the famous Augustin Louis Cauchy 
(1789-1857), who created the modern theory 
of limits (incidentally, his main ideas were 
borrowed from d’Alembert), which helped to 
clarify all the concepts of higher mathematics. 
Cauchy is rightly regarded as the father of the 
theory of functions of a complex variable (see 
Chapters 14, 15, and 17). He shares this honor, 
incidentally, with Georg Friedrich Bernhard 
Riemann (1826-1866), the German mathema- 
tician and one of the most prominent scientists 
of the 19th century, who made outstanding 
contributions in literally every field of mathe- 
matics and mathematical physics. 

Georg Riemann

While Cauchy was a representative of the
Polytechnical School of Paris, Riemann was
connected with another educational and scientific
center that played a big role in the development
of mathematics and physics in the
19th and 20th centuries: the University of
Göttingen in Germany. Here he attended lectures
by the famous mathematician, physicist,
astronomer, and geodesist Karl Friedrich Gauss 
(1777-1855), who is usually acknowledged to 
be the first mathematician of the 19th century. 
At the University of Göttingen Riemann presented
the first-ever course in the theory of
functions of a complex variable, as well as
a course in the theory of partial differential
equations. His lectures in this second course,
published after his death, were the first textbook
on the equations of mathematical physics.

David Hilbert (1862-1943) belongs to an
altogether different generation of Göttingen
scientists. He is regarded as the father of functional
analysis, which deals with the functions
of functions, or functionals (see the text preceding
Example 5 in Section 7.1). Today this
branch of mathematics includes generalized 
functions (or distributions), such as Paul 
Dirac’s delta function (see Chapters 16 and 17). 

Today the ideas of mathematical analysis 
continue to undergo intensive development, and 
computers are giving them a new and powerful 
impetus. 




Chapter 7 Investigation of Functions. 
Some Problems from Geometry 


In Chapters 1 to 6 we introduced the main con- 
cepts of higher mathematics, the concepts of 
a derivative and an integral, and suggested some 
techniques for using them. The present chap- 
ter, which concludes Part 1, partially devel- 
ops the material of the previous chapters. 
For instance, here we will dwell in detail on 
the methods used to find the maximum and 
minimum points of a function, of which we 
spoke briefly in Section 2.6. We will also tell 
about the basic geometrical applications of 
differential and integral calculus. Some themes 
in the chapter can be discussed in extracurric- 
ular studies and are written for a reader with 
an inquiring mind (e.g. see Sections 7.4, 
7.6, and 7.7). 

Since this chapter was written as an addi- 
tion to Chapters 1 to 6, it is naturally of a 
patchy nature. Loosely, the material can be 
grouped as follows: Sections 7.1 to 7.3, Sec- 
tion 7.5, Sections 7.9 to 7.11, Section 7.12, 
Sections 7.4 and 7.8, Section 7.6, and Section 7.7. 

7.1 Smooth Maxima and Minima 

The problem of finding the value $x_0$
for which a given function $y = f(x)$
attains a maximum or minimum is not
solvable in general form by the tools
of elementary algebra.

In Chapter 2 we established that at
points where a function has a maximum
or a minimum the derivative is equal to
zero. It was also shown there how to use
the derivative $y'$ to establish
exactly what the function has at the
given point $x_0$ (where $f'(x_0) = 0$): a
maximum, a minimum, or an inflection
(neither a maximum nor a minimum).
To do this we were forced to compute
the values of $y'$ for values of $x$ close to
$x_0$ on the right and on the left of $x_0$.
For instance, we found that the conditions
$y'(x) > 0$ for $x < x_0$, $y' = 0$
at $x = x_0$, and $y'(x) < 0$ for $x > x_0$
imply that at point $x_0$ the function
$y = y(x)$ attains a maximum. In Chapter 2
we also investigated another
method for finding the maxima or
minima of a function, a method that
invokes the second derivative $y''$ (only
its value at point $x = x_0$ was required,
however); namely, we found that if,
say, $f'(x_0) = 0$ and $f''(x_0) < 0$, then
at point $x = x_0$ the function $f(x)$ has
a maximum.

The same conclusions can be drawn
from considerations involving the
idea of convexity of the curve representing
a function; we touched on this
concept in Section 2.7 and will return
to it in Section 7.4. Indeed, the condition
$f'(x_0) = 0$ means that the tangent
to the graph of the function at
point $x = x_0$ is horizontal. From the
fact that $f''(x_0) < 0$ it follows that
point $x = x_0$ is a point of convexity,
that is, the curve representing $y = f(x)$
close to $x = x_0$ lies under the tangent
(see Figure 2.7.1). But these two facts
simply mean that the function $f(x)$
has a maximum at point $x = x_0$. Similar
reasoning shows that if $f'(x_1) = 0$ and
$f''(x_1) > 0$, then at point $x = x_1$ the
function $f(x)$ has a minimum.

These conclusions are also obtained
by considering Taylor's series

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \ldots \qquad (7.1.1)$$

for the function $f(x)$ at point $x = x_0$.
If $f'(x_0) \neq 0$, then for $x$ close to $x_0$
the quantities $(x - x_0)^2$, $(x - x_0)^3$, etc.
may be neglected when compared with
$x - x_0$. We therefore arrive at the
approximate expression

$$f(x) \approx f(x_0) + f'(x_0)(x - x_0), \quad \text{or} \quad f(x) - f(x_0) \approx f'(x_0)(x - x_0).$$

From this we see that if, say, $f'(x_0) > 0$,
then $f(x) - f(x_0) > 0$ (i.e. $f(x) > f(x_0)$)
for $x > x_0$ and $f(x) - f(x_0) < 0$
(i.e. $f(x) < f(x_0)$) for $x < x_0$. This
means that at $x = x_0$ the function
has neither a maximum nor a minimum.
Similarly, there is no maximum or
minimum when $f'(x_0) < 0$, only in
the first case the function is increasing
at point $x_0$, while in the second it is
decreasing.

But if $f'(x_0) = 0$, then the term
with $(x - x_0)$ in (7.1.1) vanishes and
we cannot ignore the term containing
$(x - x_0)^2$. Ignoring terms with $(x - x_0)^3$,
$(x - x_0)^4$, etc., small as compared with
the term with $(x - x_0)^2$, we get, from
(7.1.1),

$$f(x) \approx f(x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 \qquad (7.1.2)$$

(the approximate formula (7.1.2) is
the more exact the smaller $|x - x_0|$
is). From this we see that if $f''(x_0) > 0$,
then $f(x) > f(x_0)$, irrespective of
whether $x < x_0$ or $x > x_0$, that is,
$f(x_0)$ is less than any adjacent value
of $f(x)$ and therefore $f(x_0)$ is a minimum
of the function. If $f''(x_0) < 0$, then
$f(x) < f(x_0)$ and $f(x_0)$ is a maximum
of the function.

It may, however, happen that both
$f'(x_0) = 0$ and $f''(x_0) = 0$. We then
have to take the next terms in Taylor's
series (7.1.1). If $f'''(x_0) \neq 0$, then we
cannot neglect the term with $(x - x_0)^3$,
but we can freely ignore the terms with
$(x - x_0)^4$, $(x - x_0)^5$, etc., which are
small if compared with the term with
$(x - x_0)^3$. Then we have

$$f(x) \approx f(x_0) + \frac{1}{6} f'''(x_0)(x - x_0)^3, \qquad (7.1.2a)$$

and we have neither a maximum nor
a minimum, just as we had neither
a maximum nor a minimum at $x = 0$
for the function $y = x^3$ (see Figure 1.5.1;
for this function $y'(0) = y''(0) = 0$).
But if $f'(x_0) = f''(x_0) = f'''(x_0) = 0$
and $f^{IV}(x_0) \neq 0$, then in the vicinity
of point $x_0$ we have

$$f(x) \approx f(x_0) + \frac{1}{24} f^{IV}(x_0)(x - x_0)^4. \qquad (7.1.2b)$$

Here the sign of $f(x) - f(x_0)$ is the
same for $x > x_0$ and for $x < x_0$; it is
determined by the sign of $f^{IV}(x_0)$. If
$f^{IV}(x_0)$ is positive, we have a minimum
(just as in the case of the function
$y = x^4$; see Figure 1.5.1), and if $f^{IV}(x_0)$
is negative, we have a maximum.

The attentive reader has probably 
already guessed that if for x = x 0 the 


first nonzero derivative is of odd order 
(first, third, fifth, etc.), then the func- 
tion has neither a maximum nor a 
minimum at this point, while if the 
first nonzero derivative is of even order 
(second, fourth, etc.), the function has 
either a maximum or a minimum depend- 
ing on the sign of that derivative (a 
minimum if the derivative is positive 
and a maximum if it is negative). Here, 
as everywhere in this section, we con- 
sider only the maxima and minima that 
a function attains at points where it is 
represented by a smooth curve, that is,
at points where the function has a derivative
(or even several of them, $y'(x_0)$,
$y''(x_0)$, etc.). Such maxima and minima
are sometimes called smooth; other cases
related to maxima and minima that are 
not smooth will be considered in 
Section 7.2. 

Let us consider some examples. 

Example 1. Suppose that

(a) $y = A + B(x - a)^3$,

(b) $y = A + B(x - a)^4$, (7.1.3)

where $B \neq 0$. Find the maxima and
minima of $y$.

The derivatives of (7.1.3) are

(a) $y' = 3B(x - a)^2$, $y'' = 6B(x - a)$,
with $y'(a) = y''(a) = 0$, and $y''' = 6B \neq 0$.

(b) $y' = 4B(x - a)^3$, $y'' = 12B(x - a)^2$,
$y''' = 24B(x - a)$, with $y'(a) =
y''(a) = y'''(a) = 0$, and $y^{IV} = 24B \neq 0$.

In case (a) the first nonzero derivative
is the derivative of the third (odd)
order, so at point $x = a$ the
function has neither a maximum nor
a minimum (the function has no maxima
or minima at all, since $y'(x) = 0$
only at $x = a$), but it has an inflection
point at $x = a$. In case (b) the
only nonzero derivative at point $x = a$
is that of fourth (even) order and the
function has a minimum for $B > 0$
($y^{IV}(a) > 0$) and a maximum for $B < 0$
($y^{IV}(a) < 0$), which also follows from the
formula for the function. (Why?)
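
The rule of the first nonzero derivative can be tried out numerically. What follows is only a minimal sketch for polynomials, assuming a coefficient-list representation; the names `poly_derivative`, `poly_eval`, and `classify_point` are ours and do not come from the text.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (c_k multiplies x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def classify_point(coeffs, x0, tol=1e-12):
    """Classify x0 by the order and sign of the first derivative not vanishing there."""
    order = 0
    current = coeffs
    while len(current) > 1:
        current = poly_derivative(current)
        order += 1
        value = poly_eval(current, x0)
        if abs(value) > tol:
            if order == 1:
                return "not critical"   # y'(x0) != 0: increasing or decreasing
            if order % 2 == 1:
                return "inflection"     # odd order: neither maximum nor minimum
            return "minimum" if value > 0 else "maximum"
    return "constant"


if __name__ == "__main__":
    # Case (a) of Example 1 with A = 1, B = 2, a = 0: y = 1 + 2 x^3
    print(classify_point([1, 0, 0, 2], 0.0))
    # Case (b) with A = 1, B = -3, a = 1: y = 1 - 3 (x - 1)^4, expanded
    print(classify_point([-2, 12, -18, 12, -3], 1.0))
```

The tolerance `tol` is an assumption needed only because floating-point values are rarely exactly zero; for the small integer coefficients above the derivatives vanish exactly.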

Example 2. It is required to construct 
an open-at-the-top box of maximum 
volume using a square sheet of tin 
with side $2a$ by cutting out equal
squares at all corners of the sheet and
then bending the tin to form the sides 
of the box (Figure 7.1.1). What is the 
length of the side of the squares to be 
cut out? 

Let the length of the side of the cut-out
squares be $x$. The volume of the box
will depend on what kind of square we
cut out and therefore it is natural to
denote it by $V(x)$. Let us compute this
volume:

$$V(x) = (2a - 2x)^2 x = 4(a - x)^2 x, \qquad (7.1.4)$$

which is true because the base of the
rectangular box is a square with side
$2a - 2x$, while the height of the box
is $x$.

Now find the derivative of (7.1.4):

$$V'(x) = -8(a - x)x + 4(a - x)^2.$$

Solve the equation $V'(x) = 0$:

$$-8(a - x)x + 4(a - x)^2 = 0, \quad \text{or} \quad (a - x)(a - 3x) = 0,$$

whence $x_1 = a$ and $x_2 = a/3$.

We note at once that the value $x_1 = a$
is of no interest to us because then we
wouldn't have a box by cutting the
sheet in that fashion. There remains
$x = a/3$. Then

$$V\left(\frac{a}{3}\right) = 4\left(a - \frac{a}{3}\right)^2 \frac{a}{3} = \frac{16}{27}a^3 \approx 0.593a^3.$$

Here $V'(a/3) = 0$, and since

$$V''(x) = 8x - 8(a - x) - 8(a - x) = 24x - 16a,$$

we have $V''(a/3) = -8a < 0$. Consequently,
the function $V(x)$ has a maximum
at $x = a/3$.

To summarize, the maximum value 
is obtained for x = a/3, that is, we 
have to cut out squares whose sides are 
1/6 the side of the original square. 

Let us compute V (x) for several x 
close to a/3 and tabulate the results: 


x      0.25a      0.30a      0.33a      0.40a      0.45a

V(x)   0.562a^3   0.588a^3   0.592a^3   0.576a^3   0.544a^3


We see that small variations in the
value of $x$ near $x = a/3$, that is, near
the value of $x$ to which corresponds
the maximum of the function, bring
about very small changes in $V$, which
means that the function near the maximum
varies very slowly.

This is also evident from Taylor's
formula (7.1.1). If $f'(x_0) = 0$ at the
point of maximum or minimum of the
function, (7.1.1) takes the form

$$f(x) = f(x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \frac{1}{6} f'''(x_0)(x - x_0)^3 + \ldots \qquad (7.1.5)$$

This series does not contain a term with
$x - x_0$: the smallest power of $x - x_0$
on the right-hand side of (7.1.5) is
$(x - x_0)^2$, which is extremely small
for $x$ close to $x_0$. In our example, a
change in $x$ by 9% (from $0.33a$ to
$0.30a$) causes a change in $V$ by less
than 1%, while a change in $x$ by 24%
(from $0.33a$ to $0.25a$) causes a change in
$V$ by 5%. Therefore, if we are interested
in the maximal value of the function
and if we make a small error in finding
$x_0$ from the equation $V'(x) = 0$ (for
example, if we solved this equation
in an approximate fashion), then this
has but slight effect on the maximal
value of the function.

Note also that point $x = a/3$ in the
problem we are discussing here is
precisely the maximum point. Common
sense leads to such a conclusion. The
function that expresses the volume of
the box is, obviously, positive for all
values of $x$ between 0 and $a$; at the
“endpoints” $x = 0$ and $x = a$ it vanishes
(in the first case the box has zero
height and in the second it has a zero
base area). This implies that as $x$
is changed from 0 to $a$, the quantity
$V(x)$ first grows and then diminishes.
This means that somewhere in the
middle the function becomes maximal,
and the only “suspect” is the point
$x = a/3$ (since only at this point inside
the interval does $V'(x)$ vanish), which
means that this point is the maximum
point. To make this line of reasoning
more graphic, we depict the curve
$V = V(x)$ in Figure 7.1.2 (at point $x = a$
the graph touches the axis of abscissas
since $V'(a) = 0$).

Figure 7.1.2
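
The slow variation of $V$ near its maximum is easy to check by direct computation. The sketch below measures lengths in units of $a$ (so the default `a = 1.0` is only a convenience of ours), and the function names are illustrative, not taken from the text.

```python
def box_volume(x, a=1.0):
    """Volume V(x) = 4 (a - x)^2 x of the open box cut from a square sheet of side 2a."""
    return 4.0 * (a - x) ** 2 * x


def box_volume_derivative(x, a=1.0):
    """V'(x) = -8 (a - x) x + 4 (a - x)^2, as in formula (7.1.4) differentiated."""
    return -8.0 * (a - x) * x + 4.0 * (a - x) ** 2


def grid_maximum(a=1.0, steps=3000):
    """Brute-force search for the maximizing x on a uniform grid over [0, a]."""
    xs = [a * k / steps for k in range(steps + 1)]
    return max(xs, key=lambda x: box_volume(x, a))


if __name__ == "__main__":
    for x in (0.25, 0.30, 0.33, 0.40, 0.45):
        print(f"x = {x:.2f}a   V = {box_volume(x):.4f} a^3")
    print("grid maximum near x =", grid_maximum())
```

The grid search makes no use of the derivative at all; it is included only as an independent check that the root $x = a/3$ of $V'(x) = 0$ is indeed where the volume is largest.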

Example 3. Suppose we have an electric
circuit that consists of an emf
source (the emf equal to $u$; compare
with Chapter 13), a resistor (the resistance
of which is $r$, including the internal
resistance of the emf source), and a
variable resistor (resistance $R$) (Figure
7.1.3). What will be the value of
$R$ if the power output of this resistor,
$W$, is maximal?

Since the total resistance of the
circuit is $r + R$, the current $j$ flowing
through the circuit is, by Ohm's law,
$j = u/(r + R)$. The power output is $W =
j u_R$, where $u_R$ is the voltage across $R$,
which by Ohm's law is equal to $jR$.
Substituting this value of $u_R$ into the
expression for the power output,
we get

$$W = j^2 R = \frac{u^2 R}{(r + R)^2}. \qquad (7.1.6)$$

We must find the maximum of (7.1.6)
with $R$ variable. We find the derivative
$dW/dR$:

$$\frac{dW}{dR} = \frac{u^2}{(r + R)^2} - \frac{2u^2 R}{(r + R)^3} = \frac{u^2 (r - R)}{(r + R)^3}. \qquad (7.1.7)$$

This yields $dW/dR = 0$ at $R = r$.

It can be easily seen that $R = r$
corresponds to precisely a maximum
of the function $W(R)$; this follows if
only from the fact that for $R < r$ the
derivative (7.1.7) is positive and for
$R > r$ it is negative. Common sense
also leads to the same result: for small
$R$ the energy output $W$ is low because
the resistance is low (in this case
almost the entire energy output “resides”
in $r$, while the presence of $R$ changes
almost nothing; if $R = 0$, then $W = 0$),
while for large $R$, in view of
Ohm's law, the current $j$ will be low,
which means $W$ will be low, too. Somewhere
between small $R$ and large $R$
lies the value of $R$ for which the power
output $W$ is maximal, and this can
happen only at $R = r$ (see the graph
of $W = W(R)$ in Figure 7.1.4).
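
The sign change of the derivative (7.1.7) at $R = r$ can be confirmed with a few numbers. This is a hedged sketch only: the default values `u = 1.0` and `r = 1.0` are arbitrary illustrative choices, and the function names are ours.

```python
def load_power(R, u=1.0, r=1.0):
    """Power W = u^2 R / (r + R)^2 released in the variable resistor, formula (7.1.6)."""
    return u * u * R / (r + R) ** 2


def load_power_derivative(R, u=1.0, r=1.0):
    """dW/dR = u^2 (r - R) / (r + R)^3, formula (7.1.7)."""
    return u * u * (r - R) / (r + R) ** 3


if __name__ == "__main__":
    u, r = 3.0, 2.0
    for R in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(f"R = {R:4.1f}  W = {load_power(R, u, r):.4f}  "
              f"dW/dR = {load_power_derivative(R, u, r):+.4f}")
```

At $R = r$ formula (7.1.6) gives $W = u^2/(4r)$, which is the value the loop prints in its middle row for the sample data.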

Example 4. Using available boards,
we can build a fence of length $l$. How
can we fence off a rectangular yard of
maximal area using for one side the
wall of an adjacent building (Figure
7.1.5)?

Figure 7.1.3

Figure 7.1.5

Let two sides have length $x$. Then the
third side is $l - 2x$. The area of the
yard is

$$S(x) = (l - 2x)x = -2x^2 + lx, \qquad (7.1.8)$$

whence $S'(x) = -4x + l$. Clearly,
$S'(x) = 0$ only at $x = l/4$; here $S''(x)
= -4 < 0$, so that at $x = l/4$ the
function $S(x)$ has a maximum.

This result can also be obtained without resorting
to higher mathematics. Indeed, we have
to find the maximum (or minimum) of the
second-degree polynomial (7.1.8)
in the variable $x$. But for any second-degree
polynomial

$$y = ax^2 + bx + c \qquad (7.1.9)$$

we have

$$y = a\left(x^2 + \frac{b}{a}x + \frac{c}{a}\right) = a\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a}.$$

Since $(x + b/2a)^2$ is nonnegative for all $x$,
equality occurring only at $x = -b/2a$, we
find that $y$ has a maximum if $a$ is negative and
this maximum is at $x = -b/2a$, and $y$ has a
minimum if $a$ is positive and this minimum is
at $x = -b/2a$. In particular, the function
(7.1.8), with $a = -2$, $b = l$, and $c = 0$,
attains its maximum at $x = l/4$.
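
The completed-square argument translates directly into a small routine. The sketch below is illustrative only; the name `quadratic_extremum` and the sample fence length of 40 are our assumptions, not data from the text.

```python
def quadratic_extremum(a, b, c):
    """Locate the extremum of y = a x^2 + b x + c by completing the square.

    Returns (x, y, kind), where x = -b / (2a), y = c - b^2 / (4a),
    and kind is "maximum" for a < 0 and "minimum" for a > 0.
    """
    if a == 0:
        raise ValueError("not a second-degree polynomial")
    x = -b / (2.0 * a)
    y = c - b * b / (4.0 * a)
    kind = "maximum" if a < 0 else "minimum"
    return x, y, kind


if __name__ == "__main__":
    fence_length = 40.0  # hypothetical value of l
    # S(x) = -2 x^2 + l x, formula (7.1.8): a = -2, b = l, c = 0
    print(quadratic_extremum(-2.0, fence_length, 0.0))
```

For the fence problem this reproduces $x = l/4$, with the largest area equal to $l^2/8$.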

The example we have just considered is 
one of the variants of Dido's problem. As le- 
gend has it, the mythical Elissa Dido, the 
daughter of the Tyrian king Mutton, after her 
husband was slain by her brother, fled to Cy- 
prus, and thence to the coast of Africa, where 
she purchased from a local chieftain, Iarbas, 
a piece of land on which she built Carthage. 
Iarbas agreed to sell a piece of land on the mock- 
ing condition that it be no larger than the 
area covered by an oxhide. But the cunning 
Dido did not cover the tiny piece of land by 
the oxhide, as Iarbas expected she should; in- 
stead, she cut the oxhide into thin strips and 
“fenced” off a large piece of land, which was 
made even larger by using the coastline. If 
we assume the coastline to form a straight 
line and require that the fenced-off area be 
a rectangle, then the problem reduces to the 
one we have just solved. However, the situation
becomes more complicated if we assume that the
boundary of the area, the long strip made out of
the oxhide, can have an arbitrary shape. In
this case the area whose maximum we are seeking
depends not on one variable $x$ but on an
arbitrary function that specifies the shape of
the boundary. It is possible in this case to
employ the methods of higher mathematics,
too, by introducing an analog of the concept
of a differential and by proving that the points
of maxima and minima of the “functions of
functions” considered (in mathematics such
objects are known as functionals) correspond to
the zeros of the “generalized” differential.
With the aid of such methods (which cannot be
discussed in a book as elementary as this)
it can be proved that the general solution to
Dido's problem is a semicircle (Figure 7.1.6).

Figure 7.1.6

Example 5. A hiker walking from tent
$A$ to camp-fire $B$ wants to gather water
at the river. How can he do this by
traveling the shortest possible distance
(Figure 7.1.7)?

We have $AA_1 = a$, $BB_1 = b$, and
$A_1B_1 = c$; the values of $a$, $b$, and $c$
are given and $AA_1 \parallel BB_1 \perp A_1B_1$. Let
the broken line $AMB$ be the path taken
by the hiker. Our aim is to find out
for what position of point $M$ on the line
$A_1B_1$ this path is the shortest. To
determine the position of $M$ it suffices
to specify the distance from $M$ to point
$A_1$, the foot of the perpendicular dropped
from $A$ onto the straight line representing
the river. Denote this distance
$A_1M$ by $x$. Then

$$AM = \sqrt{a^2 + x^2}, \qquad MB = \sqrt{b^2 + (c - x)^2},$$

and the distance $S(x)$ traveled by the
hiker will be

$$S(x) = AM + MB = \sqrt{a^2 + x^2} + \sqrt{b^2 + (c - x)^2}. \qquad (7.1.10)$$

Figure 7.1.7

This yields

$$S'(x) = \frac{x}{\sqrt{a^2 + x^2}} - \frac{c - x}{\sqrt{b^2 + (c - x)^2}}.$$

Equating $S'(x)$ to zero yields

$$\frac{x}{\sqrt{a^2 + x^2}} = \frac{c - x}{\sqrt{b^2 + (c - x)^2}}. \qquad (7.1.11)$$

It is easy to solve this equation.
Squaring both sides, we get

$$\frac{x^2}{a^2 + x^2} = \frac{(c - x)^2}{b^2 + (c - x)^2},$$

or

$$x^2 b^2 + x^2 (c - x)^2 = a^2 (c - x)^2 + x^2 (c - x)^2,$$

that is,

$$x^2 b^2 = a^2 (c - x)^2,$$

whence

$$\frac{x^2}{(c - x)^2} = \frac{a^2}{b^2}, \qquad \frac{x}{c - x} = \pm\frac{a}{b}, \qquad \text{or} \qquad x_1 = \frac{ac}{a + b}, \quad x_2 = \frac{ac}{a - b}.$$

Substituting the values of $x_1$ and $x_2$
into the original equation (7.1.11), we
see that the second root does not satisfy
the equation. This is an extraneous
root generated by the squaring process.[7.1]
Thus, $x = ac/(a + b)$.

It is possible, however, to give a
pictorial geometrical representation
that will enable us to obtain the answer
without solving the equation. Rewrite
the condition (7.1.11) thus:

$$\frac{A_1M}{AM} = \frac{MB_1}{MB}. \qquad (7.1.11a)$$

But $A_1M/AM = \cos\angle A_1MA = \sin\alpha$.
Similarly, $MB_1/MB = \cos\angle B_1MB =
\sin\beta$. Here $\alpha$ and $\beta$ are the angles
that $AM$ and $MB$ make with the perpendicular
to $A_1B_1$ at point $M$. The condition
(7.1.11a) means that $\sin\alpha = \sin\beta$, or,
since both $\alpha$ and $\beta$ are acute,

$$\alpha = \beta. \qquad (7.1.12)$$

Thus, the hiker must take the path of
a ray of light bounced off a mirror (the
angle of incidence is equal to the angle
of reflection).

7.1 The extraneous root $x_2$ of Eq. (7.1.11)
can be dropped immediately because $x/(c - x)$
is positive and, hence, is not $-a/b$.

For a complete solution to the prob- 
lem it remains to demonstrate that 
for such a position of point M the dis- 
tance is indeed minimal (and not max- 
imal). This can be done by computing 
the second derivative of (7.1.10). 

But it is also possible to reason differently.
From the expression (7.1.10)
for $S(x)$ we see that $S(x)$ is positive
for all $x$ and increases without
bound together with the absolute
value of $x$, irrespective of whether
$x > 0$ or $x < 0$. And since $S'(x)$ vanishes
only for one value of $x$, it is clear
that at this value of $x$ the function
$S(x)$ has a minimum. In general,
if in the interval we are interested in
the first derivative of a function has
only one root, obvious considerations
often permit dispensing with a formal
investigation by means of the second
derivative (see what was said in this
connection in Example 2).
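
Both the root $x_1 = ac/(a + b)$ and the extraneous root $x_2 = ac/(a - b)$ can be tested against the unsquared equation numerically. The following is a sketch under the notation of Example 5; the sample values $a = 3$, $b = 1$, $c = 4$ are arbitrary and the function names are ours.

```python
import math


def path_length(x, a, b, c):
    """S(x) = sqrt(a^2 + x^2) + sqrt(b^2 + (c - x)^2), formula (7.1.10)."""
    return math.hypot(a, x) + math.hypot(b, c - x)


def path_derivative(x, a, b, c):
    """S'(x) = x / sqrt(a^2 + x^2) - (c - x) / sqrt(b^2 + (c - x)^2)."""
    return x / math.hypot(a, x) - (c - x) / math.hypot(b, c - x)


def optimal_point(a, b, c):
    """The root x_1 = a c / (a + b) of S'(x) = 0."""
    return a * c / (a + b)


if __name__ == "__main__":
    a, b, c = 3.0, 1.0, 4.0
    x1 = optimal_point(a, b, c)
    x2 = a * c / (a - b)  # extraneous root introduced by squaring
    print("S'(x1) =", path_derivative(x1, a, b, c))
    print("S'(x2) =", path_derivative(x2, a, b, c))
    print("S(x1)  =", path_length(x1, a, b, c))
```

The shortest path length should coincide with $\sqrt{(a + b)^2 + c^2}$, the length of the straight segment from the mirror image of $A$ to $B$.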

The problem in Example 5 can be solved
in a purely geometrical manner without resorting
to methods of higher mathematics.
Referring to Figure 7.1.8, extend the segment
$AA_1$ to $A'$ (with $A_1A' = AA_1$) and join $A'$
with $B$. Then $AM = A'M$ since $\triangle AA_1M =
\triangle A'A_1M$. Therefore $A'B = A'M + MB =
AM + MB$. For any other point $D$ on the
segment $A_1B_1$ we will have $AD + DB =
A'D + DB$ and $A'D + DB > A'B$, since a
broken line is longer than the straight-line
segment joining its endpoints. Consequently,
the desired point $M$ is the point of intersection
of the straight lines $A'B$ and $A_1B_1$,
whence follows (7.1.12). (Why?)

Figure 7.1.8

Examples 4 and 5 show that certain prob- 
lems involving the finding of maxima and min- 
ima may be solved by the tools of elementary 
mathematics, without resorting to differential 
calculus (see the text to the solutions to the 
corresponding problems printed in small type). 
But, first, not all problems can be tackled 
without appealing to higher mathematics and, 
second, the solution by elementary means re- 
quires a good deal of ingenuity, whereas high- 
er mathematics offers a standard method of 
solution of such problems. 

Do not get the idea that higher mathemat- 
ics does not require ingenuity! It will now 
be used for still harder problems. 

Let us now turn to the question of maxima 
and minima of functions of two variables. 

The definition $dy = y'\,dx$ of the differential
of a function $y = f(x)$ (see Section 4.1)
shows that the points of (smooth) maxima or
minima of the function are characterized by
the fact that the differential vanishes at these
points. This is closely related to the geometrical
meaning of the differential (see Figure
4.1.2) and the fact that the tangent to the
curve representing the function at the point
where the function is at its maximum or minimum
must be horizontal. It is equally easy
to see that at the point of a maximum or a
minimum of a function $z = F(x, y)$ of two
variables, the differential of that function,
$dz = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$,
must vanish; in other
words, $\partial F/\partial x = \partial F/\partial y = 0$. This is due to
the fact that the plane tangent to the surface
specified by the function $z = F(x, y)$ is horizontal
at the point of maximum or minimum
(Figure 7.1.9a),[7.2] as well as to the
fact that at the point $(x_0, y_0)$ where the
function $z = F(x, y)$ attains a maximum
the functions $f(x) = F(x, y_0)$ and $g(y) =
F(x_0, y)$ of one variable ($x$ in the first
case and $y$ in the second) attain a maximum,
too, that is, $f'(x)$ ($= \partial F/\partial x$) and $g'(y)$
($= \partial F/\partial y$) must vanish at this point. However,
one must bear in mind that while the fact that
$f'(x) = 0$ almost always means that $f$ has a
maximum or a minimum (since cases where the
second derivative vanishes are exceptional),
the conditions $\partial F(x_0, y_0)/\partial x = \partial F(x_0, y_0)/\partial y
= 0$ (or $dz = 0$) might mean that one of the
abovenoted functions of one variable, say
$f(x)$, attains a maximum at $(x_0, y_0)$ and the

7.2 The plane that is tangent to surface $\Pi$
(represented by the equation $z = F(x, y)$)
at point $M = M(x_0, y_0)$ contains the tangent
lines to the curves $y = y_0$, $z = F(x, y_0) =
f(x)$ and $x = x_0$, $z = F(x_0, y) = g(y)$ (to
the curves $y$ = constant and $x$ = constant);
of all the planes that pass through $M$, the
tangent plane lies closest to $\Pi$.



other, $g(y)$, a minimum. In this case the point
$(x_0, y_0)$ constitutes a saddle point of the surface
$z = F(x, y)$ and at this point $F$ attains
neither a maximum nor a minimum (Figure
7.1.9b).
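
A saddle point is easy to exhibit numerically. The functions `saddle` ($F = x^2 - y^2$) and `bowl` ($F = x^2 + y^2$) below are illustrative examples of our own choosing, not taken from the text, and the partial derivatives are estimated by central differences with an assumed step `h`.

```python
def partials(F, x0, y0, h=1e-6):
    """Estimate dF/dx and dF/dy at (x0, y0) by central differences."""
    dFdx = (F(x0 + h, y0) - F(x0 - h, y0)) / (2.0 * h)
    dFdy = (F(x0, y0 + h) - F(x0, y0 - h)) / (2.0 * h)
    return dFdx, dFdy


def saddle(x, y):
    """F = x^2 - y^2: a minimum along x and a maximum along y at the origin."""
    return x * x - y * y


def bowl(x, y):
    """F = x^2 + y^2: a genuine minimum at the origin."""
    return x * x + y * y


if __name__ == "__main__":
    for F in (saddle, bowl):
        print(F.__name__, "partials at (0, 0):", partials(F, 0.0, 0.0))
        print("  F(0.1, 0) =", F(0.1, 0.0), "  F(0, 0.1) =", F(0.0, 0.1))
```

Both partial derivatives vanish at the origin for either function, yet only for `bowl` is the origin an extremum; for `saddle` the values on the two axes lie on opposite sides of $F(0, 0)$.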


Exercises 

7.1.1. We want to build a box out of a rec- 
tangular sheet of tin of sides a and b cutting- 
out equal squares at the corners. What must 
the side of a square be so that the box is of 
maximum volume? 

7.1.2. Inscribe in an acute-angled triangle 
with base a and altitude h a rectangle of the 
largest possible area, two vertices of which lie 
on the base of the triangle and the other two 
on the sides of the triangle. 

7.1.3. Determine the greatest possible area 
of a rectangle inscribed in a circle of radius R. 

7.1.4. For what radius of the base and for 
what altitude will a closed cylindrical can of 
a given volume V have a minimum total sur- 
face area? 

7.1.5. Two bodies are moving along the
sides of a right angle with constant speeds $v_1$
and $v_2$ (meters per second) in the direction of
the vertex, from which, at the beginning, the
first was at distance $a$ meters and the second, $b$
meters. How many seconds after they started
will the distance between the bodies be at a
minimum?

7.1.6. Prove that the product of two posi- 
tive numbers whose sum is a constant is the 
greatest when the factors are equal. 

7.1.7. A straight line l divides a plane into 
two parts (medium I and medium II). A body 
moves in medium I with a speed $v_1$ and in
medium II at a speed $v_2$. What path must the
body take so as to get, in minimum time, from
a given point A of medium I to a given point B 
of medium II? (This problem poses the question 
of the law of light refraction when light passes 
from medium I to medium II, and the speed 
of light in these media is different; it is known 
that the path the light takes between points A 
and B is always such that the time taken is a 
minimum ( Fermat's principle ; see footnote 
3.21)). 


7.2 Other Types of Maxima and Minima.
Salient Points and Discontinuities.
The Left and Right Derivatives of a Function

Up to now we have said that maxima 
and minima of a function occur at 
values of x for which the first derivative 
vanishes. However, maxima and minima 
can also arise for values of the argument 
that do not make the first derivative 
vanish. 

We revert to Example 3 of Section 
7.1, where an emf source, resistance r, 
and variable resistance R are connected 
in series (Figure 7.1.3). For what value 
of R will the power output W of r be 
maximal? 

In a manner quite similar to that of
Example 3, we get, from (7.1.6),

$$W = j^2 r = \frac{u^2 r}{(r + R)^2}. \qquad (7.2.1)$$

The reader will recall that $u$ and $r$ are
assumed known and $R$ variable and
unknown. Equation (7.2.1) implies that

$$\frac{dW}{dR} = -\frac{2u^2 r}{(r + R)^3},$$

with the result that the condition
$dW/dR = 0$ becomes

$$-\frac{2u^2 r}{(r + R)^3} = 0,$$

which cannot be satisfied for any $R$.

Does this mean that the power can 
increase without bound and that the 



problem of maximum power does not 
have a solution? From the physical 
essence of the problem it follows that 
the power output will be maximal at 
R = 0 (in this case W = u 2 /r). Why 
did we not get the value R = 0 from 
the equation dWIdR = 0? 

To see why, consider the graph of 
W = W (R) (Figure 7.2.1). It is evident 
from the graph that if R could assume 
negative values, then for $R = 0$ there
would be no maximum. But from 
physical considerations it follows that 
negative R have no meaning — every 
physical problem presupposes that R 
is nonnegative. Thus, W has a maximum 
at R = 0 because the range of the 
independent variable is bounded. This 
means that if the range of the indepen- 
dent variable (the domain of the func- 
tion) is bounded, we must take into 
consideration the boundary values of 
the independent variable when testing 
for maxima and minima. 
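
The practical recipe (compare the function at the roots of the derivative and at the boundary values) can be sketched as follows. The helper `maximize_on_interval` and the finite right endpoint of 10 are assumptions made for illustration; in the problem itself $R$ is bounded only from below.

```python
def internal_power(R, u=1.0, r=1.0):
    """Power W = u^2 r / (r + R)^2 released in the fixed resistor r, formula (7.2.1)."""
    return u * u * r / (r + R) ** 2


def maximize_on_interval(f, left, right, critical_points=()):
    """Return the best candidate among the endpoints and the interior critical points."""
    candidates = [left, right]
    candidates += [x for x in critical_points if left < x < right]
    return max(candidates, key=f)


if __name__ == "__main__":
    # dW/dR never vanishes, so the list of critical points is empty
    best_R = maximize_on_interval(internal_power, 0.0, 10.0)
    print("maximum at R =", best_R, " W =", internal_power(best_R))
```

Because the derivative of (7.2.1) has no roots, the only candidates are the boundary values, and the left one, $R = 0$, gives $W = u^2/r$.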

Let us touch once more on the exam- 
ple of specific heat capacities discussed 
in Section 2.6, only now we will take 
water instead of diamond. The quantity 
of heat (in joules) that is required to
heat 1 kg of water (at atmospheric
pressure) from 0 to $T$ °C is given approximately
by the formula $Q(T) =
4186.68\,T + 8373.36 \times 10^{-5}\,T^2 + 1256
\times 10^{-6}\,T^3$, which implies that the specific
heat capacity of water, $c = c(T)$,
at a temperature $T$ is

$$c(T) = \frac{dQ}{dT} = 4186.68 + 16746.72 \times 10^{-5}\,T + 3768 \times 10^{-6}\,T^2$$

(see Exercise 2.2.2 and its solution).
Clearly, for $T$ positive $c(T)$ increases
with $T$. Does this mean that $c(T)$ for
water may be as high as desired? Of
course not, since water can exist in
the liquid state only in the temperature
interval from 0 to 100 °C, and the specific
heat capacity is maximal at the
right endpoint of the interval and
minimal at the left endpoint.
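
The endpoint values of the specific heat capacity follow at once from the approximate formula. The sketch below merely evaluates the formulas quoted in the text; the function names are ours, and the coefficients are those of the text, not independently checked physical data.

```python
def heat(T):
    """Q(T) in joules needed to heat 1 kg of water from 0 to T deg C (formula of the text)."""
    return 4186.68 * T + 8373.36e-5 * T ** 2 + 1256e-6 * T ** 3


def specific_heat(T):
    """c(T) = dQ/dT according to the same approximate formula."""
    return 4186.68 + 16746.72e-5 * T + 3768e-6 * T ** 2


if __name__ == "__main__":
    for T in (0.0, 25.0, 50.0, 75.0, 100.0):
        print(f"T = {T:5.1f}   c(T) = {specific_heat(T):.3f}")
```

Since $c(T)$ only grows on the interval from 0 to 100 °C, its least and greatest values there are simply $c(0)$ and $c(100)$.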

Let us now return to the question 
of the maxima and minima that a 
function attains at its boundaries. 
When a maximum (minimum) is at- 
tained at an endpoint x 0 of the domain 
of a function $y = f(x)$, the series

$$f(x) - f(x_0) = f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \ldots$$

may begin with the term with $(x - x_0)$.
Therefore, if a maximum (minimum) 
of the function is obtained when x = x 0 
and we have departed somewhat from 
x 0 (an endpoint of the domain of the 
function), we may err considerably in 
determining y, since the error will be 
proportional to x — x 0 and not to 
(x — x 0 ) 2 , as was the case in Section 7.1. 
Hence, even a slight error in determin- 
ing the value of the independent vari- 
able that yields a maximum may lead 
to a sizable error in the value of the 
function. 

In the above examples the function
$f(x)$ existed for $x < x_0$ as well, but
the values of the function for $x < x_0$
did not interest us because they were
devoid of any physical or geometric
meaning. It may happen, however,
that $f(x)$ is simply meaningless for
certain values of the independent variable.
For example, there can be no
meaning in an even-degree root, say
a square root, of a negative number,
with the result that the values of the
independent variable that make the
radicand vanish are the boundary values.
In a test for maxima or minima
such values of the independent variable
must be considered separately. For
instance, if $y = a - \sqrt{b - x}$, then
$y' = 1/(2\sqrt{b - x})$, and although $y'$
does not vanish anywhere, $y$ has a
maximum. The maximum is attained
at $x = b$; indeed, we see that $x = b$
nullifies the radicand, while $x$ cannot
be greater than $b$ because otherwise
the radicand would become negative.
At $x = b$ we have $y = a$; for other
values of $x$ the value of $y$ is obtained
through subtraction of a positive number[7.3]
$\sqrt{b - x}$ from $a$ and is therefore
smaller than $a$.

A maximum (or minimum) may 
also occur at interior points where 
the derivative does not vanish. This 
is the case when the curve has a salient
point (corner). Such points occur, in
particular, when the curve consists
of two parts described by different
formulas for $x > x_0$ and for $x < x_0$.
Here is an instance of a physical 
problem of this nature. 

Suppose a teakettle is being heated 
on an electric hot plate. Our problem 
is to determine the instant of time 
when the teakettle has the greatest 
amount of heat. For the sake of sim- 
plicity, we assume that the efficiency 
of the hot plate is 100%, which means 
that all the heat is delivered to the 
teakettle. We put the teakettle on to 
heat at time t = 0, at which time it 
had q joules of heat. 7 - 4 By the definition 
of the unit amount of heat Q (the 
joule), the amount of heat released by 
the hot plate is fRt, where j is the 
current in amperes, R the resistance 
in ohms, t the time in seconds. Hence, 
the amount of heat in the teakettle 
by time t will be (in joules) Q = q + 
fRt . 

At the time t = t 0 the water in the 
teakettle begins to boil, and the amount 
of heat accumulated by the teakettle 
will be $Q_0 = q + j^2 R t_0$. When the
water boils, it begins to turn into steam
and boil away.$^{7.5}$ The formation of 1 g
of steam requires approximately 2256.7


7.3 $\sqrt{b - x}$ is understood to be the positive root.

7.4 For zero we take the thermal energy of water at 0 °C.

7.5 Formation of steam starts at less than 100 °C, that is, even before boiling starts, but we ignore this fact.
joules of heat. Denoting by $dm$ the quantity of water that boils away in time $dt$, we get

$$dm \approx \frac{j^2 R}{2256.7}\,dt \approx 45 \times 10^{-5}\, j^2 R\,dt.$$

And so in one second a total of $dm/dt \approx 45 \times 10^{-5}\, j^2 R$ grams of water boil away. This requires

$$\frac{dQ_1}{dt} = 418.68\,\frac{dm}{dt} \approx 418.68 \times 45 \times 10^{-5}\, j^2 R \approx 0.1884\, j^2 R \ \text{J/s}$$

of heat because 1 g of water contains approximately 418.68 joules of heat at the temperature of boiling. Therefore, by time $t > t_0$ the amount of heat spent on transforming the boiled-away water into steam will be $Q_1 \approx 0.1884\, j^2 R\,(t - t_0)$ joules of heat.

We see that the amount of heat (in 
joules) in the teakettle is expressed by 
two different formulas: for $t < t_0$ (i.e. prior to boiling) it is $Q = q + j^2 R t$, while for $t > t_0$ (after the water started boiling) it is

$$Q \approx q + j^2 R t_0 - 0.1884\, j^2 R\,(t - t_0) = q + j^2 R\,(1.1884\, t_0 - 0.1884\, t).$$

The graph of $Q = Q(t)$ is shown in Figure 7.2.2a. It is clear from the drawing that $Q(t)$ has a maximum when $t = t_0$, although the derivative $Q'(t)$ does not vanish at this value of $t$, that is, the tangent to the graph at point $(t_0, Q(t_0))$ is not horizontal (the function $Q = Q(t)$ has no derivative at $t = t_0$ and the graph does not have a tangent line at this point).

The derivative of the function $Q = Q(t)$ has a discontinuity at $t = t_0$. Indeed, if we consider only values of $t$ less than $t_0$, we must assume that $Q'(t) = j^2 R$, while for $t > t_0$ the derivative is $Q'(t) \approx -0.1884\, j^2 R$. The graph of the derivative is shown in Figure 7.2.2b.
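The two formulas for the teakettle can be checked numerically. The following is a minimal sketch, not part of the original text: the values of q, j, R and t0 are hypothetical and chosen only for illustration, and the helper names are our own. It evaluates the difference quotients on the left and on the right of $t_0$ and shows that they approach $j^2 R$ and $-0.1884\, j^2 R$, respectively.

```python
def heat_in_kettle(t, q, j, R, t0):
    """Heat Q(t) in joules: Q = q + j^2 R t before boiling,
    Q = q + j^2 R (1.1884 t0 - 0.1884 t) after boiling starts."""
    if t <= t0:
        return q + j**2 * R * t
    return q + j**2 * R * (1.1884 * t0 - 0.1884 * t)


def one_sided_slopes(f, t, dt=1e-3):
    """Return (slope on the left, slope on the right) at t,
    each approximated by a difference quotient with dt > 0."""
    left = (f(t) - f(t - dt)) / dt
    right = (f(t + dt) - f(t)) / dt
    return left, right


# Hypothetical data, chosen only for illustration.
q, j, R, t0 = 1.0e5, 5.0, 40.0, 300.0


def Q(t):
    return heat_in_kettle(t, q, j, R, t0)


left, right = one_sided_slopes(Q, t0)
print("derivative on the left :", left)    # close to j^2 R
print("derivative on the right:", right)   # close to -0.1884 j^2 R
```

Since the two slopes have opposite signs, $Q(t)$ grows up to $t_0$ and falls afterwards, which is exactly the corner maximum described in the text.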

This example shows that a maximum 
(or minimum) may occur if the deriv- 
ative is discontinuous, that is, if the 
curve representing the function has a 
salient point (corner). 

From Figure 7.2.3 it is clear that a minimum (or maximum) may occur for those values $x_0$ of the independent variable at which the derivative has an infinite discontinuity (in the previous example the discontinuity was finite). (We have depicted in Figure 7.2.3 the graph of the function $y = x^{2/3} = \sqrt[3]{x^2}$; the inverse of this function, $y = x^{3/2}$, or $y^2 = x^3$, is a semicubical parabola; see Section 1.5 and, in particular, Figure 1.5.7.) A point of this type is called a cusp; the function $y = x^{2/3}$ has a minimum at this point. The graph of the derivative of $y = x^{2/3}$ is shown in Figure 7.2.4. Here, as in the case of an ordinary minimum, $y' < 0$ for $x < x_0$ (in Figure 7.2.4, $x_0 = 0$); the function falls off as $x$ approaches $x_0$ from the left. For $x > x_0$ we have $y' > 0$, and the function increases with $x$ after the value $x = x_0$ has been passed. But at $x = x_0$ it becomes meaningless to speak of a derivative. The derivative becomes arbitrarily large for $x$ close to $x_0$ and greater than $x_0$ and arbitrarily large in absolute value but negative for $x$ close to $x_0$ and smaller than $x_0$.

Figure 7.2.2

Figure 7.2.4
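The behavior of the derivative near the cusp can be tabulated directly. The sketch below is our own illustration (the function names are hypothetical); it uses the standard formula $y' = \tfrac{2}{3} x^{-1/3}$ for $x \ne 0$ and prints the derivative at points approaching zero from both sides.

```python
import math


def y(x):
    """y = x^(2/3), computed as the real cube root of x^2."""
    return abs(x) ** (2.0 / 3.0)


def dy(x):
    """Derivative y' = (2/3) x^(-1/3) for x != 0; its sign follows x."""
    return math.copysign(2.0 / 3.0, x) * abs(x) ** (-1.0 / 3.0)


# The derivative grows without bound in absolute value as x -> 0,
# negative on the left of the cusp and positive on the right.
for x in (1e-1, 1e-3, 1e-6):
    print(f"x = {x:g}:  y'(-x) = {dy(-x):.3f},  y'(x) = {dy(x):.3f}")
```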

The maxima and minima attained 
for values of the independent variables 
when the derivative is discontinuous 
are called cuspidal. Cuspidal maxima 
and minima and the maxima and mini- 
ma attained at the endpoints (boundary 
points) of the domain of the function 
can also be said to be nonsmooth. 

In connection with this consideration 
of singular points on curves, primarily 
salient points (see Figure 7.2.2 a), we 
can make precise our reasoning that led us to the concept of the derivative. In Chapter 2 we considered only smooth curves without specially stipulating this fact. The derivative $y'(t)$ taken at the point $t$ is equal to the limit of the ratio

$$\frac{y(t_2) - y(t_1)}{t_2 - t_1} \qquad (7.2.4)$$

as $t_2$ and $t_1$ tend to $t$ (it is clear then that the difference $t_2 - t_1$ tends to zero). We have specially emphasized that this limit does not depend on how $t_2$ and $t_1$ are chosen: they can both be greater than $t$ or both smaller than $t$ or one greater and the other smaller than $t$ or one equal to $t$ and the other greater or smaller than $t$. Indeed, when we take $\Delta t$ positive, the fraction $\dfrac{y(t + \Delta t) - y(t)}{\Delta t}$ corresponds to $t_1 = t$ and $t_2 = t + \Delta t > t$, whereas when we take the fraction $\dfrac{y(t) - y(t - \Delta t)}{\Delta t}$, we have $t_1 = t - \Delta t < t$ and $t_2 = t$.$^{7.6}$

If y = y (t) is a smooth function, 
all these fractions yield the same limit, 
which is equal to the derivative at the 
given point. The situation changes 
when we deal with a curve with a 
salient point (see Figure 7.2.2). If by $t_0$ we denote the value of $t$ at which the salient point occurs, then, taking $\dfrac{y(t_0 + \Delta t) - y(t_0)}{\Delta t}$, we get, for $\Delta t$ positive and tending to zero, a definite quantity (in the example with the teakettle this quantity is equal to $-0.1884\, j^2 R$) called the derivative on the right of the function $y(t)$ at point $t = t_0$. On the other hand, taking $\dfrac{y(t_0) - y(t_0 - \Delta t)}{\Delta t}$, we get, for $\Delta t$ positive and tending to zero, another limit (equal in the aforementioned example to $j^2 R$) called the derivative on the left of the function.

Taking $t_2$ and $t_1$ on different sides of $t_0$, we can obtain different values of the ratio (7.2.4) as $t_2 \to t_0$ and as $t_1 \to t_0$; it is easy to see that if $A$ is a corner or a cusp, then the chord $BC$, where $B$ and $C$ lie on different sides of $A$, may have different directions (imagine such a chord in Figures 7.2.2a and 7.2.3). This means that the limit of the chord, as $B$ and $C$ tend to $A$, may also be different: it depends on how precisely the points $B$ and $C$ tend to point $A$. Thus, at a salient point of a function the derivative of that function has no definite value, but the derivatives on the left and on the right can be found unambiguously.


7.6 For smooth curves the derivative was calculated as the limit of the ratio $[y(t + \Delta t/2) - y(t - \Delta t/2)]/\Delta t$ as $\Delta t \to 0$ (see the text in small print at the end of Section 6.1); here $t_1 = t - \Delta t/2 < t$ and $t_2 = t + \Delta t/2 > t$.
Figure 7.2.5

In Chapter 2, when we first began studying derivatives, we simplified matters by not stipulating each time that the derivative has a definite value, irrespective of the mode of approach of $\Delta t$ to zero (from the left or from the right), only at points at which the curve representing the function is smooth. As is evident from Figure 7.2.2b,
the curve of the derivative y'(t) has 
a discontinuity at the point where the 
curve y ( t ) has a salient point. Now if 
we replace the salient point on the 
curve y ( t ) by an arc of small radius 
that is tangent to the curve on the left 
and on the right (what draftsmen call 
conjugation ), the resulting smooth curve 
will correspond to a continuously vary- 
ing derivative; however, on the range 
of t where the curve y ( t ) is replaced 
by the arc the curve y'(t) changes di- 
rection sharply (compare Figures 7.2.5 
and 7.2.2). 

If the curve y ( t ) has a discontinuity 
at point t 0 (Figure 7.2.6a), then we 
can say that at t 0 the derivative y'(t) 
is infinite (although actually the func- 
tion has no derivative at this point). 
Indeed, if the discontinuity is replaced by a smooth variation of $y$ from $y_1$ to $y_2$ over a small interval from $t_0 - \varepsilon$ to $t_0 + \varepsilon$, then on this interval the derivative is equal to $(y_2 - y_1)/(2\varepsilon)$, which is a very large quantity, increasing as $\varepsilon$ decreases (Figure 7.2.6b).



Figure 7.2.6

Now how does the integral $\int_a^b y(t)\,dt$ behave when the function $y(t)$ is not
smooth? If the function has a salient 
point, then no new problems arise 
when we compute the area bounded by 
the curve $y(t)$. In Section 3.2 we split up the definite integral (the area under the curve) into rectangular strips with an area $y(t_n)\,(t_{n+1} - t_n)$ or $y(t_{n+1})\,(t_{n+1} - t_n)$. In the limit, as the intervals, that is, the differences $t_{n+1} - t_n$, get smaller and smaller, it makes no difference whether one takes $y(t_n)$ or $y(t_{n+1})$ either in the case of a smooth curve or in the case of a curve with salient points.

If a curve $y(t)$ is discontinuous at a point $t = t_0$ but remains bounded, then for the interval that contains the discontinuity ($t_n < t_0 < t_{n+1}$) the quantities $y(t_n)$ and $y(t_{n+1})$ remain distinct no matter how $t_n$ and $t_{n+1}$ approach one another. To summarize, then, in the expression of the integral as a sum, the value of one of the summands in this case to a great extent depends on how the sum is taken: by formula (3.2.1) or by formula (3.2.2). However, as $t_{n+1} - t_n$ tends to zero, the summand itself tends to zero, whereby the limit of the sum, the integral, has a definite value (independent of the way in which the sum was computed) also in the case where the integrand has a discontinuity in the domain of integration. Here, of course, we assume that the values of the function to the left and to the right of the discontinuity and, hence, the size of the discontinuity are finite: the function does not behave as $y = 1/x$ does in the neighborhood of point $x = 0$.
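The argument above can be illustrated with a short computation. In the sketch below (our own example; the step function and the interval are hypothetical) the integrand jumps from 1 to 3 at $t_0 = 0.5$, and the sums built on the left and on the right ordinates differ by a quantity proportional to the strip width, so both tend to the same integral, equal to 2.

```python
def step(t):
    """A bounded function with a finite jump at t0 = 0.5 (hypothetical example)."""
    return 1.0 if t < 0.5 else 3.0


def left_sum(f, a, b, n):
    """Sum of strips y(t_k) (t_{k+1} - t_k) with the left ordinate."""
    h = (b - a) / n
    return sum(f(a + k * h) for k in range(n)) * h


def right_sum(f, a, b, n):
    """Sum of strips y(t_{k+1}) (t_{k+1} - t_k) with the right ordinate."""
    h = (b - a) / n
    return sum(f(a + (k + 1) * h) for k in range(n)) * h


# n is odd so that the jump lies strictly inside one strip.
for n in (11, 101, 1001):
    lo, hi = left_sum(step, 0.0, 1.0, n), right_sum(step, 0.0, 1.0, n)
    print(n, lo, hi, hi - lo)   # the difference shrinks like 2/n
```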

The relationship between the integral 
and the derivative is likewise preserved. 
Referring to Figure 7.2.2, let us take 
the function Q'(t) (whose graph is 
shown in Figure 7.2.2 b) and denote 
it by $f(t)$. Then the function $Q(t)$, whose graph is shown in Figure 7.2.2a, is the integral $Q(t) = \int f(t)\,dt$. This example shows that a discontinuity in the integrand function $f(t) = Q'(t)$ leads to a salient point in the integral $Q(t)$ of this function. The definite integral of a function with a discontinuity of this type can be found with the aid of the indefinite integral by the general rule

$$\int_a^b f(t)\,dt = Q(b) - Q(a).$$

We may continue: consider Figure 7.2.6. We can say that for a function tending to infinity on an interval tending to zero (Figure 7.2.6b),$^{7.7}$ the integral is a discontinuous function (Figure 7.2.6a). However, in this
case we must make precise the law by which 
the function tends to infinity and the interval 
to zero. We will not dwell on that here. Exam- 
ples of this kind (a fuller consideration of which 
requires additional refinement) lead to the 
concept of the delta function (see Chapter 16). 

7.7 The expressions “an interval tending to zero” and “a function tending to infinity” point to the fact that in essence we are speaking not of a single function but of a family of functions with the following property: as we go over from one function in this family to another, the “growth interval” becomes more and more narrow and the “degree of growth” higher and higher. For more details see Chapter 16.

Thus, the final scheme for solving problems that involve finding the maxima and minima of a function goes as follows. First we must find the derivative of the function under investigation and determine the stationary points, that is, points at which the variation of the function is the smallest: points at which the derivative vanishes. But in addition to these
points, “suspects” (i.e. points at which 
the function may be maximal or minim- 
al) are the boundary points of the 
domain of the function, salient points, 
and points at which the derivative 
becomes infinite and changes its sign 
(“cusps”). We must then check how 
the function changes at all these points. 
If at a point where $f'(x) = 0$ the
graph is represented by a smooth curve, 
information on the type of the point 
(maximum, minimum, inflection) can 
be obtained through the second deriv- 
ative of the function (or higher-order 
derivatives if the first and second 
derivatives vanish at this point simul- 
taneously). In other cases an idea about 
the behavior of the function in a point 
being considered may be given by the 
one-sided (i.e. left or right) derivatives; 
it is clear that at the boundary points 
of the domain of a function only one- 
sided derivatives can exist. Finally, 
if we are interested in the absolute 
maximum and/or minimum of a function 
over the range of variation of the 
function, that is, the largest of all the 
(local) maxima or the smallest of all 
the (local) minima, we must compare 
the values of the function at all points 
of local maxima (or minima). 
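The last step of the scheme, comparing the values of the function at all the "suspects", is easy to mechanize. The sketch below is only an illustration under our own assumptions: the function $y = x^{2/3}$ comes from the text, but the interval $[-1, 8]$ and the helper name `compare_candidates` are hypothetical, and the lists of stationary and nonsmooth points are supplied by hand.

```python
def compare_candidates(f, candidates):
    """Return ((smallest value, its x), (largest value, its x))
    among the candidate points supplied by the caller."""
    values = [(f(x), x) for x in candidates]
    return min(values), max(values)


def y(x):
    """y = x^(2/3), computed as the real cube root of x^2."""
    return abs(x) ** (2.0 / 3.0)


stationary = []            # y' = (2/3) x^(-1/3) vanishes nowhere
endpoints = [-1.0, 8.0]    # hypothetical domain -1 <= x <= 8
nonsmooth = [0.0]          # the cusp, where y' becomes infinite

(y_min, x_min), (y_max, x_max) = compare_candidates(
    y, stationary + endpoints + nonsmooth
)
print("absolute minimum:", y_min, "at x =", x_min)
print("absolute maximum:", y_max, "at x =", x_max)
```

Here the absolute minimum is the cuspidal one at $x = 0$, while the absolute maximum is attained at the right endpoint, so neither is found by solving $y' = 0$.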

Exercises 

7.2.1. Find the smallest value of the function $y = x^2 - 2x + 3$ as $x$ varies from 2 to 10.

7.2.2. A fisherman F is sitting on the bank of a river and fishing. Going down the river, at a distance $h$ from the place where F is sitting and with speed $v$, there is a steamer S of length $l$, and at time $t = t_0$ the bow of the steamer is positioned exactly opposite F. (We ignore the fact that S has width and assume that S moves along a straight line parallel to the bank.) For the distance between F and S it is natural to take the distance from F to the point on S closest to F. How does this distance $D$ vary with time? When will $D = D(t)$ be minimal? Draw the graph of the function $D = D(t)$ for $h = 300$ m, $l = 60$ m, and $v = 5$ m/s.

7.2.3. Find the cuspidal maxima of the following functions: (a) $y = (x - 5)\sqrt[3]{x^2}$, and (b) $y = 1 - \dots$

7.3 Investigating Maxima and Minima 
of Functions Depending on a Parameter 

In Sections 7.1 and 7.2 we separated 
the cases of “internal” maxima and 
minima, which are attained by a func- 
tion in interior points of the domain 
of the function (the range of the in- 
dependent variable on which the func- 
tion depends), and maxima and minima 
attained by the function at the end- 
points of the domain of the function. 
However, in studying functions one 
may encounter problems in which both 
types of maxima (or minima) are pres- 
ent. Situations of this kind often occur 
when the function we are interested 
in depends not only on the indepen- 
dent variable but also on a parameter, 
and a variation in this parameter within 
the limits allowed by the problem 
changes the nature of the maxima (or 
minima) or even the nature of the 
function itself. To illustrate what we 
have just said there are several exam- 
ples. 

Example 1. Select a point $M$ on the diameter $AC$ of a circle $\sigma$. What must the angle $\alpha$ be between the diameter and a chord $BD$ of $\sigma$ passing through point $M$ so that the quadrangle $ABCD$ inscribed in $\sigma$ is of maximal area (Figure 7.3.1)?



Figure 7.3.1 


We take the radius of $\sigma$ as the unit of length; the distance $OM$ from the center $O$ of the circle to point $M$ is denoted by $a$. Obviously, the altitudes $BK$ and $DL$ of $\triangle ABC$ and $\triangle ADC$ are, respectively, $MB \sin\alpha$ and $DM \sin\alpha$, which yields

$$S_{ABCD} = S = S_{\triangle ABC} + S_{\triangle ADC} = \tfrac{1}{2}\,AC \cdot BM \sin\alpha + \tfrac{1}{2}\,AC \cdot DM \sin\alpha = \tfrac{1}{2}\,AC\,(BM + DM) \sin\alpha = \tfrac{1}{2}\,AC \cdot BD \sin\alpha = BD \sin\alpha,$$

since, obviously, $AC = 2$.

Moreover, the distance $OP$ ($= b$) from the center of $\sigma$ to the chord $BD$ is $OM \sin\alpha = a \sin\alpha$, which yields $BP = PD = \sqrt{1 - a^2 \sin^2\alpha}$ (since $OB = OD = 1$) and, hence, $BD = 2\sqrt{1 - a^2 \sin^2\alpha}$, that is,

$$S = BD \sin\alpha = 2 \sin\alpha\,\sqrt{1 - a^2 \sin^2\alpha}. \qquad (7.3.1)$$

The problem of finding the maximum of $S$ becomes really simple if we go over from $S$ to the square of the area $S^2 = 4 \sin^2\alpha\,(1 - a^2 \sin^2\alpha)$, since it is clear that $S$ and $S^2$ attain their maxima (or minima) simultaneously, and introduce a new variable, $x = \sin^2\alpha$. Then

$$S^2 = 4x\,(1 - a^2 x). \qquad (7.3.2)$$

Since in the $xy$-plane ($y = S^2$) the graph of (7.3.2) is a parabola, we can do without finding the derivatives; however, since there is a general (and simple) method for finding the maxima and minima of a function through differentiation, we use this method. Clearly, if $F(x) = 4x - 4a^2 x^2$, then $F'(x) = 4 - 8a^2 x$ and $F'(x) = 0$ at $x = 1/(2a^2)$. And since $S(\alpha)$ (and, hence, $S^2(\alpha) = F(x)$) vanishes at $\alpha = 0$ and $\alpha = 180^\circ$ and between these two values is positive, we conclude that it attains a maximum (see the text at the end of Example 2 in Section 7.1) somewhere in this interval; precisely, we should expect that the maximum of $S(\alpha)$ and that of $F(x)$ is attained at $x = \sin^2\alpha = 1/(2a^2)$.

But it is clear that $\sin^2\alpha$ is equal to $1/(2a^2)$ only for $1/(2a^2) \le 1$, or $2a^2 \ge 1$, or $a \ge \sqrt{2}/2$, and in this case the solution of the problem is indeed supplied by the condition $\sin^2\alpha = 1/(2a^2)$, or $\sin\alpha = \pm\sqrt{2}/(2a)$; there are two chords, $BD$ and $B_1D_1$, that are symmetric about $AC$ and form the appropriate angle $\alpha$ with $AC$ (at $a = \sqrt{2}/2$ they merge into one chord $BD \perp AC$). For such a value of $x$ the function $F(x)$ has an “internal” (smooth) maximum. For $a < \sqrt{2}/2$ the function $F(x)$ varies monotonically with $x$ varying between 0 and 1 (since $x = \sin^2\alpha$), whereby the greatest value is attained at the boundary point $x = 1$, corresponding to $\alpha = 90^\circ$: here only one chord, $BD$, which is perpendicular to $AC$, yields a maximum for $S$. We note also that in view of (7.3.2) the value $S = S_{\max}$ is expressed differently for the two cases we are considering: for $a \ge \sqrt{2}/2$ we have $F_{\max} = F(1/(2a^2)) = 4 \times (1/(2a^2)) \times (1 - 1/2) = 1/a^2$, which means that in this case $S_{\max} = \sqrt{F_{\max}} = 1/a$; but if $a \le \sqrt{2}/2$, then $F_{\max} = F(1) = 4\,(1 - a^2)$ and, hence, $S_{\max} = \sqrt{F_{\max}} = 2\sqrt{1 - a^2}$. Thus, the graph for $S_{\max} = S(a)$ (in the $aS$-plane) is represented, on the segment $0 \le a \le 1$, by a “broken” curve consisting of two different arcs: the arc of the ellipse $a^2 + (S/2)^2 = 1$ on the segment $0 \le a \le \sqrt{2}/2$ and the arc of the hyperbola $S = 1/a$ on the segment $\sqrt{2}/2 \le a \le 1$ connected to the first arc.
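The two-branch expression for $S_{\max}$ can be compared with a direct search over the angle. The following sketch is our own check, not part of the original solution: it evaluates formula (7.3.1) on a fine grid of angles (the grid size is an arbitrary choice) and compares the largest value with $2\sqrt{1 - a^2}$ or $1/a$, whichever applies.

```python
import math


def area(alpha, a):
    """Area S from (7.3.1): S = 2 sin(alpha) sqrt(1 - a^2 sin^2(alpha))."""
    s = math.sin(alpha)
    return 2.0 * s * math.sqrt(1.0 - (a * s) ** 2)


def s_max_closed_form(a):
    """S_max as stated in the text, for 0 <= a <= 1."""
    if a <= math.sqrt(2.0) / 2.0:
        return 2.0 * math.sqrt(1.0 - a * a)   # boundary case, alpha = 90 degrees
    return 1.0 / a                            # interior (smooth) maximum


def s_max_brute_force(a, n=20000):
    """Largest value of S on a uniform grid of angles in [0, pi]."""
    return max(area(math.pi * k / n, a) for k in range(n + 1))


for a in (0.3, 0.6, 0.8, 1.0):
    print(a, s_max_closed_form(a), s_max_brute_force(a))
```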

Note also that the square $T$ inscribed in $\sigma$ for which $AC$ is the midline cuts out of this midline a segment $L_1L_2$, with $OL_1 = OL_2 = \sqrt{2}/2$; thus, if point $M$ lies inside $T$, then the sought chord $BD$ is perpendicular to $AC$, but if $M$ lies outside $T$, then the first case of the above two occurs. Here $b = OP = a \sin\alpha = a \cdot \sqrt{2}/(2a) = \sqrt{2}/2 = \text{constant}$, and in this way the distance between $BD$ and the center $O$ of circle $\sigma$ is constant and is equal to the distance from $O$ to a side of square $T$; the chords $BD$ and $B_1D_1$ that pass through point $M$, which is exterior with respect to $T$ (lies outside $T$) and lies on $AC$, touch the circle $\sigma_1$ inscribed in $T$.

This result, which appears unexpected at first glance, becomes crystal clear if we draw the graph of the function $y = F(x) = 4x - 4a^2 x^2 = -4a^2\,(x - 1/(2a^2))^2 + 1/a^2$, a straight line at $a = 0$ and a parabola with its vertex at point $Q\,(1/(2a^2),\ 1/a^2)$ for $a \ne 0$ (Figure 7.3.2). For $a > \sqrt{2}/2$ the vertex $Q$ lies inside the domain of $F(x)$, $0 \le x \le 1$, and the function attains its maximum at this vertex, while for $a < \sqrt{2}/2$ the point $Q$ lies outside the domain, and the maximum is attained at the boundary point $x = 1$. Note also that for $a > \sqrt{2}/2$ the value $x = 1$ of the independent variable corresponds not to a maximum of $F(x)$ but, on the contrary, to a (local) minimum because for values of $x$ close to unity but less than unity (for values of $\alpha$ close to a right angle) the magnitude of $F(x)$ (the area $S(\alpha)$) will be larger than at point $x = 1$ (at $\alpha = 90^\circ$).

Of course, in solving this problem there was no need to go over from function $S(\alpha) = 2 \sin\alpha\,\sqrt{1 - a^2 \sin^2\alpha}$ to function $F(x)$. The derivative of $S(\alpha)$ with respect to $\alpha$ can be found by the rules of differentiation. The result is

$$\frac{dS}{d\alpha} = \frac{2 \cos\alpha\,(1 - 2a^2 \sin^2\alpha)}{\sqrt{1 - a^2 \sin^2\alpha}}. \qquad (7.3.3)$$

From (7.3.3) we see that the fact that $S'(\alpha)$ vanishes is equivalent to $\cos\alpha = 0$ or to $\sin\alpha = \pm 1/(a\sqrt{2})$. The first case corresponds to $\alpha = 90^\circ$, that is, a chord $BD \perp AC$, while the second case is possible only if $a \ge \sqrt{2}/2$ ($\approx 0.71$). Accordingly, we arrive at different solutions to this problem depending on whether $a > \sqrt{2}/2$ or $a < \sqrt{2}/2$. For $a < \sqrt{2}/2$, the derivative $S'(\alpha)$ vanishes at a single point, $\alpha = 90^\circ$, and since the function $S = S(\alpha)$ must have a maximum, there can be no doubt that $\alpha = 90^\circ$ corresponds to precisely a maximum of the function. However, when $a$ is greater than $\sqrt{2}/2$, the derivative $S'(\alpha)$ vanishes at three points: at $\alpha = 90^\circ$ and at $\sin\alpha = \pm 1/(\sqrt{2}\,a)$. If we check the sign of $S''(\alpha)$ (or check the sign of $S'(\alpha)$ in the neighborhood of points $\alpha = 90^\circ$ and $\alpha = \pm\arcsin(1/(\sqrt{2}\,a))$), we will see that in this case the maxima are attained precisely at points $\alpha = \pm\arcsin(1/(\sqrt{2}\,a))$, while at point $\alpha = 90^\circ$ the function has a local minimum (Figure 7.3.3).

Here is another aspect in which Figure 7.3.2 differs from Figure 7.3.3. In the former, point $x = 1$ is a boundary point of function $F(x)$, and this function has no geometrical meaning when $x$ is greater than unity; the maximum (or minimum) of $F(x)$ at this point is a “boundary” one, since here $F'(1) \ne 0$. In the latter, on the other hand, point $\alpha = 90^\circ$ is an interior point for $S(\alpha)$; the maximum (or local minimum) attained by $S(\alpha)$ at this point is a smooth one. This should not be a cause for surprise, since it is clear that the functions $S(\alpha)$ and $S^2(\alpha)$ ($= F(x)$) attain their maximum and minimum (since $S > 0$) at the same point, but the nature of the maximum (minimum) may change. For instance, all three functions $y = \sqrt{|x|}$, $y_1 = y^2 = |x|$, and $y_2 = y_1^2 = x^2$ (Figure 7.3.4) have the same minimum point $x = 0$; however, $y$ has at this point a cuspidal minimum (both the left and right derivatives become infinite at this point), $y_1$ has a “corner” minimum (the left and right derivatives are nonzero and do not coincide), and $y_2$ has a smooth minimum at $x = 0$ (here $y_2'(0) = 0$). Indeed, if $Y = F(x) = y^2(x)$, then $dY/dx = 2yy'$, and the conditions $Y' = 0$ and $y' = 0$ are not equivalent. Another reason why the nature of the maximum or minimum may change is a change of the independent variable (when we went over from $S$ to $F$ we changed not only the function itself but also the independent variable, from $\alpha$ to $x = \sin^2\alpha$): if $y = f(x)$ and $x = \varphi(t)$, we have $dy/dt = (dy/dx)\,(dx/dt)$, and $dy/dx = 0$ is not equivalent to $dy/dt = 0$.

Figure 7.3.5 depicts the graph of the function $\alpha = \alpha(a)$, where $\alpha$ is defined by the condition $S_{ABCD} = S = S_{\max}$; in a certain sense this drawing summarizes the information we obtained while solving the problem. The reader can clearly see that if $a \le \sqrt{2}/2$, the problem of finding the quadrangle $ABCD$ with the greatest possible area has a unique solution, while if $a > \sqrt{2}/2$, there are two solutions, and they correspond to two chords $BD$ and $B_1D_1$ symmetric about $AC$. (Note that such a “splitting”, or branching, at a critical point of one solution into two, an event corresponding to a change in the nature of the solution to the problem, is rather often encountered in mathematics.) Also, the very fact that at the critical point $a = \sqrt{2}/2$ the point of maximum $\alpha = 90^\circ$ of the function $S = S(\alpha)$ transforms into a point of (local) minimum is inherent in many problems and not only in this, rather randomly chosen, problem.

Figure 7.3.5

Of course, this problem, too, can be solved without resorting to differential calculus, that is, by purely geometrical methods. It is obvious that

$$S_{ABCD} = S = S_{\triangle MAB} + S_{\triangle MBC} + S_{\triangle MCD} + S_{\triangle MDA} = \tfrac{1}{2} \sin\alpha \times (MA \cdot MB + MB \cdot MC + MC \cdot MD + MD \cdot MA) = \tfrac{1}{2}\,(AC \cdot BD) \sin\alpha = BD \sin\alpha;$$

on the other hand,

$$S_{\triangle OBD} = s = \tfrac{1}{2}\,BD \cdot OP = \tfrac{1}{2}\,BD \cdot a \sin\alpha = \tfrac{1}{2}\,a \cdot BD \sin\alpha,$$

that is, $s = (a/2)\,S$, which means that $s$ and $S$ attain their maxima simultaneously. But since

$$s = \tfrac{1}{2}\,OB \cdot OD \sin\angle BOD = \tfrac{1}{2} \sin\beta,$$

where $\beta = \angle BOD$, it remains to be established at what point $\sin\beta$ attains its maximum. Clearly, if $\beta$ can be equal to $90^\circ$ ($\sin\beta = 1$), this will correspond to the absolute maximum of $s$ (and, hence, of $S$, too). But $\beta = \angle BOD = 90^\circ$ if the length of the chord $BD$ of $\sigma$ is equal to $\sqrt{2}$, the length of the side of the square $T$ inscribed in $\sigma$, or if the distance $OP$ from the center of $\sigma$ to $BD$ is equal to half the side of $T$ (the half, of course, is $\sqrt{2}/2$ units long), that is, when $BD$ touches the circle inscribed in $T$, which is obviously impossible if point $M$ belonging to $AC$ lies inside the square $T$ (with midline $AC$) inscribed in $\sigma$. Thus, when point $M$ belonging to $AC$ lies outside $T$, the problem is solved by drawing through $M$ the straight line $BD$ tangent to $\sigma_1$ (either $BD$ or $B_1D_1$) for which $\beta = 90^\circ$; but if $M$ lies inside $T$, the situation changes.


In the latter case the distance $OP$ from the center $O$ to $BD$ is equal, as we saw earlier, to $a \sin\alpha$, that is, it can vary from 0 (the case where $\alpha = 0$ and $BD$ coincides with $AC$) to $a$ (the case where $\alpha = 90^\circ$, or $BD \perp AC$). Accordingly, the length $d = BD$ of chord $BD$ can vary from 2 (when $BD = AC$) to $2\sqrt{1 - a^2}$ (when $BD \perp AC$); the angle $\beta$ varies from $180^\circ$ (when $BD = AC$) to $2 \arccos a$ (when $BD \perp AC$), that is, remains constantly obtuse (since $a$ is less than $\sqrt{2}/2$, we conclude that $\arccos a$ is greater than $45^\circ$). The quantity $\sin\beta$ attains its maximum at $\beta = \beta_{\min} = 2 \arccos a$, that is, at $BD \perp AC$, and it is this chord $BD$ that ensures that the area of $ABCD$ is at its maximum.

Of course, the unnatural solution that we 
are discussing here is much more difficult to 
comprehend than the elegant solution involv- 
ing finding the derivative of S (a) (or F (x)), 
since one must guess that the areas of S and s 
are proportional, and this is not an easy job. 

Example 2. Let the product of the distances from a (variable) point $M$ to two fixed points $A$ and $B$ be $a$. What is the greatest distance $d = MP = d_{\max}$ from $M$ to the straight line $AB$, and how does one specify the point $M$ for which this distance $d_{\max}$ is realized?

In many respects this problem is 
similar to the previous one, and for 
this reason we go into less detail. The 
set of all points $M$ for which $MA \cdot MB =$
constant ($= a$) constitutes a curve $\Sigma$
called the oval of Cassini (Figure 7.3.6)
in honor of the Italian astronomer 
Giovanni Domenico Cassini (1625-1712). 
(Being a follower of the heliocentric 
theory of the solar system, Cassini 
tried to correct the faults of Ptolemy’s 
system by substituting for Kepler’s 
ellipses as the paths of the planets curves 
like the one just mentioned; such curves 
are also called Cassinians). If we introduce a plane system of coordinates in such a manner that points $A$ and $B$ have coordinates $(-1, 0)$ and $(1, 0)$ (in this way the distance $AB$ between the points is equal to two units of length), then, in view of the fact that for $M = M(x, y)$, obviously, $MA = \sqrt{(x + 1)^2 + y^2}$ and $MB = \sqrt{(x - 1)^2 + y^2}$, the equation of curve $\Sigma$ has the form

$$\sqrt{(x + 1)^2 + y^2} \times \sqrt{(x - 1)^2 + y^2} = a,$$

which can be simplified thus:

$$(x^2 + y^2)^2 - 2\,(x^2 - y^2) = a^2 - 1. \qquad (7.3.4)$$

(Verify this.)

Figure 7.3.6
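A numerical check is no substitute for the algebra, but it guards against slips. In the sketch below (our own illustration; the sampling range and the seed are arbitrary) we take random points $M(x, y)$, compute $a = MA \cdot MB$ directly, and confirm that the left-hand side of (7.3.4) equals $a^2 - 1$ up to rounding error.

```python
import math
import random


def product_of_distances(x, y):
    """a = MA * MB for A = (-1, 0) and B = (1, 0)."""
    return math.hypot(x + 1.0, y) * math.hypot(x - 1.0, y)


def lhs_734(x, y):
    """Left-hand side of (7.3.4): (x^2 + y^2)^2 - 2 (x^2 - y^2)."""
    return (x * x + y * y) ** 2 - 2.0 * (x * x - y * y)


random.seed(0)  # arbitrary seed, only to make the run repeatable
max_error = 0.0
for _ in range(1000):
    x, y = random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)
    a = product_of_distances(x, y)
    max_error = max(max_error, abs(lhs_734(x, y) - (a * a - 1.0)))

print("largest discrepancy:", max_error)
```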

It is clear that the oval of Cassini 
degenerates into a pair of points, A 
and B (in which case our problem has 
no meaning), if a = 0. If a is very 
small, then either the distance MA or 
the distance MB is very small (other- 
wise the product MA • MB cannot be 
small), and in this case the oval of 
Cassini splits into two ovals surround- 
ing points A and B. As a increases, 
the two ovals of Cassini increase in 
size and, at $a = 1$, finally touch at the origin $O$, so that the entire curve resembles an “eight” lying on its side. This curve (in view of (7.3.4), its equation is $(x^2 + y^2)^2 = 2\,(x^2 - y^2)$) was studied by Jakob Bernoulli and is called the lemniscate of Bernoulli. When $a$ is greater than unity, the oval of Cassini is a single closed curve enveloping the two “poles” of the curve, $A$ and $B$. Finally, when $a$ is very large, the distances $MA$ and $MB$ are also large and will differ little; here the oval of Cassini resembles a circle of an extremely large radius $\sqrt{a}$ centered at $O$.

Let us now find the derivative $dy/dx$, where the dependence of $y$ on $x$ is given by the following equation: $F(x, y) = 0$, with $F(x, y) = (x^2 + y^2)^2 - 2\,(x^2 - y^2) - a^2 + 1$ (cf. (7.3.4)). In view of (4.13.7), we have

$$\frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} = -\frac{2\,(x^2 + y^2)\,2x - 4x}{2\,(x^2 + y^2)\,2y + 4y} = -\frac{(x^2 + y^2 - 1)\,x}{(x^2 + y^2 + 1)\,y}, \qquad (7.3.5)$$

whence $dy/dx = 0$ if and only if $x = 0$ or $x^2 + y^2 = 1$. But at $x = 0$ the equation of the oval of Cassini, (7.3.4), yields $y^4 + 2y^2 = a^2 - 1$, or $(y^2 + 1)^2 = a^2$, from which it becomes completely clear that for $a < 1$ the oval of Cassini does not intersect the straight line $x = 0$ (the $y$ axis); here $dy/dx = 0$ only at points for which $x^2 + y^2 = 1$, or at points where the oval of Cassini intersects the unit circle $S$ with diameter $AB$ (since $S$ has the equation $x^2 + y^2 = 1$). It is clear that there are four points at which $\Sigma$ intersects $S$, and these points are pairwise symmetric about the $x$ axis and the $y$ axis. They necessarily yield the solution to our problem, since they lie farthest from the $x$ axis.

The points of intersection of $\Sigma$ and $S$ can be found by solving the following system of equations:

$$x^2 + y^2 = 1 \quad\text{and}\quad 2\,(x^2 - y^2) = (x^2 + y^2)^2 + 1 - a^2 = 2 - a^2,$$

or

$$x^2 + y^2 = 1 \quad\text{and}\quad x^2 - y^2 = 1 - a^2/2,$$

which immediately yields

$$x^2 = 1 - a^2/4 \quad\text{and}\quad y^2 = a^2/4.$$

This readily implies that these points exist only when $a$ is no greater than 2, since $x^2$ is necessarily nonnegative. Whence, for $a > 2$ the condition $dy/dx = 0$ is satisfied only at $x = 0$, which means that the points of the oval of Cassini that lie farthest from the $x$ axis (such points are sure to exist) are those at which $\Sigma$ intersects the $y$ axis.

Finally, for $1 < a < 2$, the derivative $dy/dx$ on the upper half of the oval of Cassini (since $\Sigma$ is symmetric about the axis of abscissas, we can always consider only the upper half of this curve) vanishes three times: at the point where $\Sigma$ intersects the axis of ordinates and at the two points where $\Sigma$ and $S$ intersect. If we study how $dy/dx$ changes sign near these points (or find the sign of $d^2y/dx^2$), we will see that $y$ is at its maximum exactly at the points where $\Sigma$ and $S$ intersect, while $x = 0$ corresponds to a (local) minimum of $y$ (see Figure 7.3.6). Thus, the set of all points of maxima of $y$ for different values of $a$ consists of circle $S$ and the part of the axis of ordinates external to $S$ (Figure 7.3.7): for $a \le 2$ the point of maximum of $y$ belongs to circle $S$, and the closer $a$ is to 2, the closer is the pair of points above the $x$ axis to each other (the same, of course, is true of the pair of points below the $x$ axis); at $a = 2$ the two points with $y > 0$ (and the two points with $y < 0$) merge, and for $a$ greater than 2 the points of maximum of $y$ belong to the $y$ axis and not to circle $S$. (The reader can easily convince himself that the graph of $y = y_{\max}(a)$ in the $ay$-plane consists of the segment of the straight line $y = a/2$ corresponding to $0 \le a \le 2$ and the arc of the parabola $y^2 = a - 1$ corresponding to the values of $a$ greater than 2.)
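The result for $y_{\max}(a)$ can be checked numerically. The following is only a minimal sketch, assuming the foci are $A = (-1, 0)$ and $B = (1, 0)$ and the oval is the set of points with $MA \cdot MB = a$; the function names and the grid size are illustrative choices, not part of the text.

```python
import math

def y_max_formula(a):
    """y_max(a) as stated above: a/2 for a <= 2, sqrt(a - 1) for a > 2."""
    return a / 2 if a <= 2 else math.sqrt(a - 1)

def y_max_numeric(a, steps=20_000):
    """Brute-force search over x for the largest y on the oval of Cassini."""
    best = 0.0
    x_hi = math.sqrt(a + 1)  # the oval meets the x axis at x^2 = 1 + a
    for i in range(steps + 1):
        x = x_hi * i / steps
        # (x^2 + y^2 + 1)^2 - 4x^2 = a^2  =>  y^2 = sqrt(a^2 + 4x^2) - x^2 - 1
        y_squared = math.sqrt(a * a + 4 * x * x) - x * x - 1
        if y_squared > best:
            best = y_squared
    return math.sqrt(best)

for a in (0.5, 1.5, 2.0, 3.0):
    print(a, y_max_numeric(a), y_max_formula(a))
```

The two columns printed for each $a$ should agree to several decimal places, both below and above the threshold $a = 2$.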

Of course, this problem can also be solved by the methods of elementary geometry, that is, without using differential calculus. It is clear that

$$S_{\triangle MAB} = \frac{1}{2}\, MA \cdot MB \sin\alpha = \frac{a}{2}\sin\alpha,$$

while on the other hand

$$S_{\triangle MAB} = \frac{1}{2}\, AB \cdot MP = MP,$$

where $MP$ is the distance from $M$ to $AB$, and $\alpha = \angle AMB$, whereby the maximum of $MP$ corresponds to the maximum of $\sin\alpha$. Therefore, if our curve $\Sigma$ contains points $M$ such that $\angle AMB = 90^\circ$ and $\sin\alpha = 1$, then these points yield the maximum of $MP$, since all points $M$ of the circle $S$ whose diameter is $AB$, for which $\angle AMB = 90^\circ$, are indeed the points of maximum of $MP$ (of course, different points on $S$ correspond to different values of parameter $a$). Completing $\triangle MAB$ to parallelogram $MAM_1B$ centered at $O$ readily leads to

$$4MO^2 = 2MA^2 + 2MB^2 - AB^2 = 2(MA - MB)^2 + 4MA \cdot MB - 4 = 4(a - 1) + 2(MA - MB)^2 \ge 4(a - 1),$$

where we still assume that $AB = 2$. Thus, if $a > 2$, then the equality $MO = 1$, which means that $M$ belongs to $\Sigma$ and to circle $S$ with diameter $AB$ and, hence, $\alpha = \angle AMB = 90^\circ$, is simply impossible (while at $a = 2$ only the points of intersection of $\Sigma$ with the axis of ordinates, points for which $MA - MB = 0$, satisfy the condition $MO = 1$). Hence, while for $a < 2$ the maximum of the distance $|y| = MP$ from point $M$ on curve $\Sigma$ to the axis of abscissas is attained at the points where $\Sigma$ and $S$ intersect, for $a > 2$ the situation is different.

The reader can easily see that if $a > 2$, then for each point $M$ belonging to $\Sigma$ the angle $\alpha = \angle AMB$ is always acute (since point $M$ lies outside circle $S$ whose diameter is $AB$); therefore, we need only find a point $M$ for which angle $\alpha$ (and, hence, $\sin\alpha$) is maximal. By the law of cosines applied to $\triangle MAB$,

$$2MA \cdot MB \cos\alpha = 2a\cos\alpha = MA^2 + MB^2 - AB^2 = (MA - MB)^2 + 2MA \cdot MB - 4 = (MA - MB)^2 + 2a - 4,$$

which implies that the smaller the value of $\cos\alpha$ (and, hence, the greater the value of $\alpha$), the smaller the quantity $(MA - MB)^2$, whence the smallest possible value of $\cos\alpha$ (and, hence, the greatest possible value of $\sin\alpha$) will be achieved at $MA - MB = 0$, that is, when $M$ belongs to the $y$ axis. Thus, for $a > 2$ the maximum distance $MP$ from $M$ to $AB$ is attained for the points on the axis of ordinates. This completes the solution to the problem.

Example 3. Here is another example. It is more complicated than the previous two, has a serious physical meaning, and also deals with finding the maxima and minima of a function. It is well known that the relationship between the volume $v$, the pressure $p$, and the absolute temperature $T$ (in kelvins) of an ideal gas is given by Boyle's law and Gay-Lussac's law, which can be unified into the ideal gas law

$$pv = RT, \tag{7.3.6}$$

where $R$ is the molar gas constant of an ideal gas, the adjective "molar" meaning that when related to one mole of a gas, $R$ is the same for all gases; for instance, if $v$ is measured in m³ and $p$ in Pa, then $R \approx 8.31$ N·m/K. However, formula (7.3.6) does not agree very well with the properties of real gases, and so a common way to correct it is to write it in the form

$$\left(p + \frac{a}{v^2}\right)(v - b) = RT, \tag{7.3.7}$$


which is known as the van der Waals equation of state.$^{7.8}$ The constants $a$ and $b$ are empirical (i.e. determined for each gas separately) positive constants (in the system of units used above to define $R$ the dimensions of these constants are N·m⁴ and m³, respectively). If the temperature of the gas under investigation is kept constant, the formulas (7.3.6) and (7.3.7) correspond to certain curves in the $vp$-plane known as isotherms (curves of equal temperature). Each value of $T$ has its isotherm.

7.8 Johannes Diderik van der Waals (1837-1923), a Dutch physicist.

It is clear that ideal-gas isotherms given by formula (7.3.6) for different values of (parameter) $T$ comprise a family of hyperbolas $pv = \text{constant}$ $(= RT)$ with asymptotes $v = 0$ and $p = 0$; these isotherms have neither maxima nor minima. The situation becomes considerably more complicated when we go over to the van der Waals isotherms (7.3.7). We will now investigate such isotherms. The curve specified by (7.3.7) depends on three parameters: the coefficients $a$ and $b$ in the van der Waals equation of state (these coefficients are characteristics of the gas being investigated) and the temperature of the gas, $T$. Later we will see that the qualitative behavior of isotherm (7.3.7) is determined by the ratio of $T$ to the fraction $a/b$.

Let us use (7.3.7) and find how $p$ depends on $v$, or the function $p = p(v)$. The answer is

$$p = \frac{RT}{v - b} - \frac{a}{v^2}. \tag{7.3.8}$$

From this it follows that

$$p' = \frac{dp}{dv} = -\frac{RT}{(v - b)^2} + \frac{2a}{v^3} = \frac{1}{(v - b)^2}\left[\frac{2a(v - b)^2}{v^3} - RT\right], \tag{7.3.9}$$

in view of which the possible maxima and minima of $p = p(v)$ (for a given temperature $T$) are determined from the condition

$$\frac{2a(v - b)^2}{v^3} - RT = 0, \tag{7.3.10}$$

which is equivalent to $p' = 0$. Unfortunately, however, (7.3.10) is a cubic equation in $v$, and the solution of this equation is not that simple.

Not being able to solve Eq. (7.3.10) directly, we will study it qualitatively. Consider the function $f(v) = 2a(v - b)^2/v^3 - RT$ (which depends on the same three parameters $a$, $b$, and $T$). We will try to establish how it varies, that is, find its maxima and minima. To this end we find the derivative of $f$:

$$\frac{df}{dv} = 2a\left[\frac{2(v - b)}{v^3} - \frac{3(v - b)^2}{v^4}\right] = \frac{2a(v - b)}{v^4}\,(3b - v). \tag{7.3.11}$$

From the physics of the problem (see Eq. (7.3.7)) it follows that $v$ is always greater than $b$ (the absolute temperature $T$ cannot be negative), whereby (7.3.11) may vanish only at $v = 3b$. Further, for $b < v < 3b$ the right-hand side of (7.3.11) is positive, and for $v > 3b$ it is negative, which implies that $v = 3b$ corresponds to the maximum of $f(v)$, the greatest possible value, equal to $8a/27b - RT$. Now we must consider three different cases, corresponding to three temperature intervals.
(a) $8a/27b - RT < 0$, that is, $T > 8a/27bR$, or $T/(a/b) > 8/(27R)$. In this case $f(v) < 0$ for all $v$ and this means that $p'$ is negative for all values of $v$, whereby the function $p = p(v)$ is constantly decreasing as $v$ grows (the upper two van der Waals isotherms for carbon dioxide in Figure 7.3.8; the temperature at each curve is given in degrees Celsius). The respective curves of $p$ vs. $v$ resemble hyperbolas, and the asymptotes are the straight lines $p = 0$ and $v = b$.

(b) $8a/27b - RT > 0$, that is, $T < 8a/27bR$, or $T/(a/b) < 8/(27R)$ (see the lower two curves in Figure 7.3.8). Since at $v = b$ the function $f(v)$ is negative (it is unimportant that there is no such state of matter with $v = b$) and at $v = 3b$ we have $f(v) = 8a/27b - RT > 0$, between $b$ and $3b$ there must be a value $v_0$ of $v$ such that $f(v_0) = 0$ and, hence, $p'(v_0) = 0$. For this value the function $p = p(v)$ is at a minimum (for smaller values of $v$ the derivative $p'$ is negative and for larger values it is positive). This is not all, however. As $v$ becomes greater and greater ($v \to \infty$), the value of $p$ diminishes (in view of (7.3.8)), and so somewhere at $v = v_1 > 3b$ the function $p = p(v)$ must attain its maximum. We see that while in case (a) the van der Waals isotherms have neither maxima nor minima and resemble hyperbolas (ideal gas isotherms), in case (b) their nature changes: here the function $p(v)$ has a (local) minimum and a (local) maximum. The asymptotes of the $p$ vs. $v$ curves in case (b) are the same as in case (a).

(c) $8a/27b - RT = 0$, that is, $T = 8a/27bR$, or $T/(a/b) = 8/(27R)$. Since the greatest possible value of $f(v)$ here is zero, for all other values of $v$ this function is negative, which means that $p' = f(v)/(v - b)^2$ is negative, too. Thus, the function $p = p(v)$ is everywhere a decreasing function, just as in case (a). At $v = 3b$ it has a point of inflection (since $p'(3b) = 0$).
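The three cases can be illustrated numerically. The sketch below is only one way to do it, assuming $T > 0$ and arbitrary consistent units; the helper names `f`, `bisect`, and `isotherm_extrema` are hypothetical, and plain bisection is used because the sign pattern of $f(v)$ established above brackets each root.

```python
def f(v, T, a, b, R):
    """f(v) = 2a(v - b)^2 / v^3 - RT; p'(v) has the same sign as f(v)."""
    return 2 * a * (v - b) ** 2 / v ** 3 - R * T

def bisect(g, lo, hi, iterations=200):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def isotherm_extrema(T, a, b, R):
    """Return (v0, v1), the minimum and maximum of p(v), or None in cases (a), (c)."""
    if f(3 * b, T, a, b, R) <= 0:       # greatest value of f is not positive
        return None
    g = lambda v: f(v, T, a, b, R)
    v0 = bisect(g, b, 3 * b)            # f(b) = -RT < 0 < f(3b)
    hi = 6 * b
    while g(hi) > 0:                    # f(v) tends to -RT < 0 as v grows
        hi *= 2
    v1 = bisect(g, 3 * b, hi)
    return v0, v1

# Units with a = b = R = 1, so the threshold temperature is 8/27.
print(isotherm_extrema(0.25, 1.0, 1.0, 1.0))  # case (b): two extrema
print(isotherm_extrema(0.30, 1.0, 1.0, 1.0))  # case (a): None
```

Bisection is chosen here over solving the cubic (7.3.10) in closed form, which mirrors the qualitative argument of the text rather than replacing it.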

In case (a), obviously, to each value of pressure $p$ there corresponds a single value of volume $v$, which corresponds to the possibility of only one state of matter occurring at $T > 8a/27bR$, and this is the gaseous state. On the other hand, if $T < 8a/27bR$, to each value of pressure $p$ there correspond, as shown by Figure 7.3.8, three different values of volume $v$, so that here matter can exist in different states. The state with the smallest volume $v = v_{\text{liq}}$ (i.e. the greatest density) is the liquid state. The state with the greatest volume $v = v_{\text{gas}}$ (at the same temperature and pressure as the liquid state) is the gaseous state. Finally there is the intermediate state, which proves to be unstable. If the substance we are studying is placed in a vessel of a certain volume $v$ (somewhere between $v_{\text{liq}}$ and $v_{\text{gas}}$) and heated up to a temperature $T$ lower than a certain temperature $T_c$ (see below), part of the substance will be in the gaseous state and the remainder will be in the liquid state. A definite state with the intermediate volume $v$ does not exist.

The intermediate case with $T = 8a/27bR$ plays an exceptional role: the temperature $T$ in this case is called the critical temperature of the gas (it is usually denoted by $T_c$). The point of inflection $C$ of the respective isotherm corresponds to the following values of volume and pressure: $v = 3b$ $(= v_c)$ and $p = a/27b^2$ $(= p_c)$; these volume and pressure are also called critical. At $T = T_c$, if $p > p_c$ (or $v < v_c$), the substance is in the liquid state, and if $p < p_c$ (or $v > v_c$), the substance is in the gaseous state; thus, $p_c$ is the highest pressure at which the substance can exist in the form of saturated vapor. Note also that point $C$ of the critical isotherm (each substance has only one critical isotherm) corresponds to $p_c v_c = a/9b = (3/8)RT_c$, which is only three-eighths of the quantity predicted by the ideal gas law.
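The critical values quoted above are easy to confirm by substituting them into (7.3.8). The following is a small sketch under the assumption of arbitrary positive constants; the numerical values of `a`, `b`, and `R` in the usage lines are illustrative only and are not taken from the text.

```python
def pressure(v, T, a, b, R):
    """Van der Waals pressure from (7.3.8): p = RT/(v - b) - a/v^2."""
    return R * T / (v - b) - a / v ** 2

def critical_point(a, b, R):
    """Critical volume, pressure and temperature as given in the text."""
    v_c = 3 * b
    p_c = a / (27 * b ** 2)
    T_c = 8 * a / (27 * b * R)
    return v_c, p_c, T_c

a, b, R = 0.36, 4.3e-5, 8.31          # illustrative values in SI units
v_c, p_c, T_c = critical_point(a, b, R)
print(pressure(v_c, T_c, a, b, R), p_c)   # the two numbers should coincide
print(p_c * v_c / (R * T_c))              # the ratio 3/8 noted above
```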

A fuller study of the van der Waals equa- 
tion, its consequences, and further modifi- 
cations can be found in textbooks on physics 
and engineering thermodynamics. 

7.4* Convex Functions and Algebraic 
Inequalities 

In Section 1.4 we defined the convexity of a curve $y = f(x)$ (or a function $y = f(x)$) in the following fashion: a function $y = f(x)$ is said to be convex on the interval $a \le x \le b$ of variation of $x$ (or curve $y = f(x)$ is said to be convex upward, or simply convex) if any chord $AB$ of this curve ($A = A(x_1, f(x_1))$ and $B = B(x_2, f(x_2))$, $a \le x_1 < x_2 \le b$) lies below the corresponding arc $AB$ of the curve $y = f(x)$ (Figure 7.4.1a). On the other hand, if a chord $AB$ lies above the corresponding arc $AB$ of the curve $y = f(x)$, the function $f$ is said to be concave (or the curve $y = f(x)$ is said to be convex downward, or simply concave) (Figure 7.4.1b). In other words, the function $y = f(x)$ is convex on the interval $a \le x \le b$ if the point $N$ at which the straight line $x = \text{constant}$ ($a \le x_1 < x < x_2 \le b$) intersects the chord $AB$ lies below the point $M$ where the same straight line intersects the curve representing $y = f(x)$; on the other hand, if $N$ lies above $M$, the function $f$ is concave. This definition already relates the concept of convexity with inequalities, since if the points $M$ and $N$ have coordinates $(x, y)$ and $(x, Y)$, the convexity of $y = f(x)$ implies $y > Y$, while the concavity of $y = f(x)$ implies $y < Y$.

Figure 7.4.1

In Section 2.7 we established a simple condition for the convexity of a function $y = f(x)$, namely, that a function is convex if its second derivative, $y'' = d^2y/dx^2$, is negative. For instance, from this convexity condition it immediately follows that the functions $y = \ln x$, $y = -x^2$, and $y = -1/x$ (for $x > 0$) are convex since their second derivatives are, respectively, $y'' = (1/x)' = -1/x^2$, $y'' = (-2x)' = -2$, and $y'' = (1/x^2)' = -2/x^3$ (Figure 7.4.2; the reader will recall that the function $y = \ln x$ is defined only for $x > 0$).

There exists a simple but important

Theorem 1 If $y = f(x)$ is a function that is convex on the interval from $a$ to $b$ and $x_1$ and $x_2$ are two values of the independent variable within the interval (i.e. two arbitrary numbers such that $a \le x_1 < x_2 \le b$), then

$$\frac{f(x_1) + f(x_2)}{2} < f\left(\frac{x_1 + x_2}{2}\right). \tag{7.4.1}$$

Proof. In Figure 7.4.3, $OA = x_1$ and $OB = x_2$. Then $AM = f(x_1)$ and $BN = f(x_2)$. Moreover, if $S$ is the middle of the line segment $AB$, then $OS = (x_1 + x_2)/2$ and, hence, $SP = f\left(\frac{x_1 + x_2}{2}\right)$. On the other hand, since the length of the midline $SQ$ of the trapezoid $ABNM$ is equal to one-half the sum of the lengths of the bases $AM$ and $BN$, we can write $SQ = [f(x_1) + f(x_2)]/2$. But according to the definition of a convex function, the midpoint $Q$ of the chord $MN$ lies below point $P$ of the arc $MN$; hence

$$\frac{f(x_1) + f(x_2)}{2} < f\left(\frac{x_1 + x_2}{2}\right),$$

which is what we set out to prove.$^{7.9}$

Figure 7.4.3
Examples.

(a) $y = \ln x$. The convexity of this function implies that

$$\frac{\ln x_1 + \ln x_2}{2} < \ln\frac{x_1 + x_2}{2},$$

i.e. $\ln\sqrt{x_1 x_2} < \ln\dfrac{x_1 + x_2}{2}$, or, finally,

$$\sqrt{x_1 x_2} < \frac{x_1 + x_2}{2}, \tag{7.4.2}$$

which demonstrates that the geometric mean of two distinct positive numbers is smaller than their arithmetic mean.


7.9 In proving Theorem 1 (and all subsequent theorems in this section), we confine our discussion to the case where $f(x_1)$ and $f(x_2)$ are of the same sign. We advise the reader to consider the case of opposite signs on his own (instead of employing the property of the midline of a trapezoid, the reader should use the following theorem: the length of the segment of the midline of a trapezoid lying between the diagonals is equal to one-half the difference of the lengths of the bases).


(b) $y = -x^2$. Reasoning in the same way as in (a), we get

$$-\frac{x_1^2 + x_2^2}{2} < -\left(\frac{x_1 + x_2}{2}\right)^2,$$

or, in another form,

$$\frac{x_1^2 + x_2^2}{2} > \left(\frac{x_1 + x_2}{2}\right)^2, \quad\text{that is,}\quad \sqrt{\frac{x_1^2 + x_2^2}{2}} > \frac{x_1 + x_2}{2}.$$

The expression $\sqrt{\dfrac{a_1^2 + a_2^2 + \dots + a_k^2}{k}}$, which is the square root of the arithmetic mean of the squares of the $k$ numbers $a_1, a_2, \ldots, a_k$, is known as the root-mean-square of these numbers. Thus, the above result can be formulated as follows: the root-mean-square of two distinct positive numbers is always greater than the arithmetic mean of these numbers.

(c) $y = -1/x$. Theorem 1 implies that

$$-\frac{1}{2}\left(\frac{1}{x_1} + \frac{1}{x_2}\right) < -\frac{1}{(x_1 + x_2)/2},$$

or

$$\frac{1}{2}\left(\frac{1}{x_1} + \frac{1}{x_2}\right) > \frac{2}{x_1 + x_2},$$

or, finally,

$$\frac{2}{1/x_1 + 1/x_2} < \frac{x_1 + x_2}{2}. \tag{7.4.3}$$

The quotient $1 \div \dfrac{1/a_1 + 1/a_2 + \dots + 1/a_k}{k}$ $\left(= \dfrac{k}{1/a_1 + 1/a_2 + \dots + 1/a_k}\right)$, whose divisor is simply the arithmetic mean of the numbers $1/a_1, 1/a_2, \ldots, 1/a_k$ (the $k$ reciprocals of the positive numbers $a_1, a_2, \ldots, a_k$), is called the harmonic mean of the numbers $a_1, a_2, \ldots, a_k$. Thus, the inequality (7.4.3) states that the harmonic mean of two distinct positive numbers is smaller than the arithmetic mean of these numbers.

Theorem 1 can be generalized as follows:

Theorem 2 If a function $y = f(x)$ is convex in the interval from $a$ to $b$ and $x_1$ and $x_2$ are two values of the independent variable within the interval ($a \le x_1 < x_2 \le b$) and $p$ and $q$ are two arbitrary positive numbers whose sum is equal to unity, then

$$p f(x_1) + q f(x_2) < f(p x_1 + q x_2). \tag{7.4.4}$$

(For $p = q = 1/2$ this theorem transforms into Theorem 1.)

Figure 7.4.4

Proof. First of all, we note that if $M$ and $N$ are two points with coordinates $(x_1, y_1)$ and $(x_2, y_2)$ and $Q$ is the point that divides segment $MN$ in the ratio $MQ : QN = q : p$, with $p + q = 1$, then the coordinates of $Q$ are $(p x_1 + q x_2,\ p y_1 + q y_2)$. Indeed, if we denote by $X_1$, $X_2$, and $X$ and $Y_1$, $Y_2$, and $Y$ the projections of points $M$, $N$, and $Q$ on the $x$ and $y$ axes (Figure 7.4.4), then points $X$ and $Y$ divide the segments $X_1X_2$ and $Y_1Y_2$ in the ratio $q : p$. This yields$^{7.10}$

$$OX = OX_1 + X_1X = x_1 + q(x_2 - x_1) = (1 - q)x_1 + q x_2 = p x_1 + q x_2$$

and

$$OY = OY_2 + Y_2Y = y_2 + p(y_1 - y_2) = (1 - p)y_2 + p y_1 = p y_1 + q y_2.$$

We return now to the graph of our convex function $y = f(x)$ (Figure 7.4.5). Let us assume that $OA = x_1$, $OB = x_2$, $AM = f(x_1)$, and $BN = f(x_2)$. According to what we have just proved, the coordinates of the point $Q$ that divides $MN$ in the ratio $MQ : QN = q : p$ are $(p x_1 + q x_2,\ p f(x_1) + q f(x_2))$; thus, the length of $SQ$ in Figure 7.4.5 is equal to $p f(x_1) + q f(x_2)$ and that of $SP$ is $f(p x_1 + q x_2)$. But since $y = f(x)$ is a convex function, point $Q$ lies below point $P$, which means that $p f(x_1) + q f(x_2) < f(p x_1 + q x_2)$, which is what we set out to prove.$^{7.11}$

7.10 In Figure 7.4.4 we depicted the case where all four numbers $x_1$, $x_2$, $y_1$, and $y_2$ are positive. The reader is advised to consider the alternative cases himself.
Examples.

(a) $y = \ln x$. In this case (7.4.4) yields

$$p\ln x_1 + q\ln x_2 < \ln(p x_1 + q x_2),$$

whence

$$x_1^p x_2^q < p x_1 + q x_2, \quad p > 0,\ q > 0,\ p + q = 1.$$

(b) $y = -x^2$. We have

$$-p x_1^2 - q x_2^2 < -(p x_1 + q x_2)^2,$$

or $p x_1^2 + q x_2^2 > (p x_1 + q x_2)^2$, or $\sqrt{p x_1^2 + q x_2^2} > p x_1 + q x_2$, where $p > 0$, $q > 0$, and $p + q = 1$.

(c) $y = -1/x$. Here we have

$$-\frac{p}{x_1} - \frac{q}{x_2} < -\frac{1}{p x_1 + q x_2},$$

or

$$\frac{p}{x_1} + \frac{q}{x_2} > \frac{1}{p x_1 + q x_2},$$

where $x_1 > 0$, $x_2 > 0$, $p > 0$, $q > 0$, and $p + q = 1$.

7.11 As the reader can easily see, the coordinates of any point belonging to the segment $MN$ can be represented in the form $(p x_1 + q x_2,\ p y_1 + q y_2)$, where the numbers $p$ and $q$ (which are different for each such point) are such that $p \ge 0$, $q \ge 0$, and $p + q = 1$. Thus, the inequality (7.4.4) states that the entire chord $MN$ lies below the curve $y = f(x)$, which is equivalent to the function being convex.

Figure 7.4.6

Theorem 1 can also be generalized as follows:

Theorem 3 If $y = f(x)$ is a function convex on the interval from $a$ to $b$ and $x_1, x_2, \ldots, x_k$ is a set of $k$ values of the independent variable within this interval such that some are distinct, then

$$\frac{f(x_1) + f(x_2) + \dots + f(x_k)}{k} < f\left(\frac{x_1 + x_2 + \dots + x_k}{k}\right) \tag{7.4.5}$$

(a particular case of Jensen's inequality$^{7.12}$). For $k = 2$ Theorem 3 transforms into Theorem 1.

Proof. We start by defining a concept that is often used in geometrical and analytical problems. Suppose that $M_1M_2M_3 \ldots M_k$ is an arbitrary $k$-gon (Figure 7.4.6a), $Q_2$ is the midpoint of side $M_1M_2$ of this $k$-gon ($M_1Q_2 : Q_2M_2 = 1/2 : 1/2$), $Q_3$ is the point that divides $M_3Q_2$ in the ratio $2 : 1$ ($M_3Q_3 : Q_3Q_2 = 2/3 : 1/3$), $Q_4$ is the point that divides $M_4Q_3$ in the ratio $3 : 1$ ($M_4Q_4 : Q_4Q_3 = 3/4 : 1/4$), \ldots, and, finally, $Q_k$ is the point that divides $M_kQ_{k-1}$ in the ratio $(k - 1) : 1$, that is, $M_kQ_k : Q_kQ_{k-1} = (k - 1)/k : 1/k$.

7.12 Johan Ludvig William Valdemar Jensen (1859-1925), a Danish analyst, algebraist, and engineer.

Point $Q_k$ is known as the centroid (the center of mass) of the $k$-gon $M_1M_2 \ldots M_k$. In the case of a triangle $M_1M_2M_3$ (Figure 7.4.6b) the centroid $Q_3$ coincides with the point of intersection of the medians of the triangle; indeed, in this case $Q_2$ is the middle of side $M_1M_2$, the segment $M_3Q_2$ is a median, and the point $Q_3$ that divides $M_3Q_2$ in the ratio $M_3Q_3 : Q_3Q_2 = 2 : 1$ is the point of intersection of the medians of the triangle.

Let us prove that if the coordinates of the vertices $M_1, M_2, \ldots, M_k$ of a $k$-gon are $(x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k)$, then the coordinates of the centroid $Q_k$ are $\left((x_1 + x_2 + \dots + x_k)/k,\ (y_1 + y_2 + \dots + y_k)/k\right)$.$^{7.13}$

7.13 This implies, for one, that the centroid of a $k$-gon is determined solely by the $k$-gon and does not depend on the order in which the vertices of the $k$-gon are numbered (contrary to what we may assume from the definition of a centroid). In the case of a triangle this also follows from the fact that the centroid of a triangle coincides with the point of intersection of the medians.

Indeed, by the proposition established at the beginning of the proof of Theorem 2, the points $Q_2, Q_3, Q_4, \ldots, Q_k$ have the following coordinates:

$$Q_2\left(\frac{x_1 + x_2}{2},\ \frac{y_1 + y_2}{2}\right);$$

$$Q_3\left(\frac{2}{3}\cdot\frac{x_1 + x_2}{2} + \frac{1}{3}x_3,\ \frac{2}{3}\cdot\frac{y_1 + y_2}{2} + \frac{1}{3}y_3\right), \quad\text{or}\quad Q_3\left(\frac{x_1 + x_2 + x_3}{3},\ \frac{y_1 + y_2 + y_3}{3}\right);$$

$$Q_4\left(\frac{3}{4}\cdot\frac{x_1 + x_2 + x_3}{3} + \frac{1}{4}x_4,\ \frac{3}{4}\cdot\frac{y_1 + y_2 + y_3}{3} + \frac{1}{4}y_4\right), \quad\text{or}\quad Q_4\left(\frac{x_1 + x_2 + x_3 + x_4}{4},\ \frac{y_1 + y_2 + y_3 + y_4}{4}\right);$$

$$\ldots;$$

$$Q_k\left(\frac{k - 1}{k}\cdot\frac{x_1 + x_2 + \dots + x_{k-1}}{k - 1} + \frac{1}{k}x_k,\ \frac{k - 1}{k}\cdot\frac{y_1 + y_2 + \dots + y_{k-1}}{k - 1} + \frac{1}{k}y_k\right),$$

or

$$Q_k\left(\frac{x_1 + x_2 + \dots + x_{k-1} + x_k}{k},\ \frac{y_1 + y_2 + \dots + y_{k-1} + y_k}{k}\right).$$
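The step-by-step construction of the centroid can be mirrored in a few lines of code. This is a minimal sketch with hypothetical function names: it builds $Q_2, Q_3, \ldots, Q_k$ one point at a time, exactly as in the definition, and compares the result with the plain average of the coordinates.

```python
def centroid_by_construction(points):
    """Q_2 is the midpoint of M_1 M_2; Q_j divides M_j Q_{j-1} in the ratio (j-1):1."""
    qx, qy = points[0]
    for j, (x, y) in enumerate(points[1:], start=2):
        # Q_j = ((j-1)/j) * Q_{j-1} + (1/j) * M_j
        qx = (j - 1) / j * qx + x / j
        qy = (j - 1) / j * qy + y / j
    return qx, qy

def centroid_by_average(points):
    """Arithmetic means of the abscissas and of the ordinates."""
    k = len(points)
    return sum(x for x, _ in points) / k, sum(y for _, y in points) / k

vertices = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 4.0)]
print(centroid_by_construction(vertices))
print(centroid_by_average(vertices))
print(centroid_by_construction(vertices[::-1]))  # order of vertices does not matter
```

Up to rounding, all three printed points coincide, which also illustrates the remark that the centroid does not depend on how the vertices are numbered.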

Let us now return to the convex function $y = f(x)$. Suppose that $M_1, M_2, \ldots, M_k$ are $k$ sequential points on the graph of this function taken within the interval from $a$ to $b$ (Figure 7.4.7). Due to the convexity of the function, the $k$-gon $M_1M_2 \ldots M_k$ is convex and lies entirely under the curve $y = f(x)$. If the abscissas of the points $M_1, M_2, \ldots, M_k$ are $x_1, x_2, \ldots, x_k$, then their ordinates are, obviously, $f(x_1), f(x_2), \ldots, f(x_k)$. Hence, the coordinates of the centroid $Q$ of the $k$-gon $M_1M_2 \ldots M_k$ are

$$\left(\frac{x_1 + x_2 + \dots + x_k}{k},\ \frac{f(x_1) + f(x_2) + \dots + f(x_k)}{k}\right),$$

with the result that

$$OS = \frac{x_1 + x_2 + \dots + x_k}{k}, \quad SQ = \frac{f(x_1) + f(x_2) + \dots + f(x_k)}{k},$$

and

$$SP = f\left(\frac{x_1 + x_2 + \dots + x_k}{k}\right)$$

(see Figure 7.4.7). But the centroid of a convex $k$-gon lies inside the $k$-gon (this follows from the very definition of a centroid); hence, point $Q$ lies below point $P$ and

$$\frac{f(x_1) + f(x_2) + \dots + f(x_k)}{k} < f\left(\frac{x_1 + x_2 + \dots + x_k}{k}\right),$$

which is what we set out to prove.

This line of reasoning remains valid when some (but not all) points $M_1, M_2, \ldots, M_k$ coincide (not all of the numbers $x_1, x_2, \ldots, x_k$ are distinct) and the $k$-gon degenerates into a polygon with a smaller number of vertices.
Examples.

(a) $y = \ln x$. From Theorem 3 it follows that

$$\frac{\ln x_1 + \ln x_2 + \dots + \ln x_k}{k} < \ln\frac{x_1 + x_2 + \dots + x_k}{k},$$

or

$$\sqrt[k]{x_1 x_2 \cdots x_k} < \frac{x_1 + x_2 + \dots + x_k}{k},$$

that is, the geometric mean of $k$ positive numbers some of which are distinct is smaller than the respective arithmetic mean. This is known as the theorem on geometric and arithmetic means.

(b) $y = -x^2$. In this case we get

$$-\frac{x_1^2 + x_2^2 + \dots + x_k^2}{k} < -\left(\frac{x_1 + x_2 + \dots + x_k}{k}\right)^2,$$

or

$$\frac{x_1^2 + x_2^2 + \dots + x_k^2}{k} > \left(\frac{x_1 + x_2 + \dots + x_k}{k}\right)^2, \quad\text{i.e.}\quad \sqrt{\frac{x_1^2 + x_2^2 + \dots + x_k^2}{k}} > \frac{x_1 + x_2 + \dots + x_k}{k},$$

that is, the root-mean-square of $k$ positive numbers some of which are distinct is greater than the respective arithmetic mean.

(c) $y = -1/x$. In this case Theorem 3 yields

$$-\frac{1}{k}\left(\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_k}\right) < -\frac{k}{x_1 + x_2 + \dots + x_k},$$

that is,

$$\frac{1}{k}\left(\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_k}\right) > \frac{k}{x_1 + x_2 + \dots + x_k},$$

whence

$$\frac{k}{1/x_1 + 1/x_2 + \dots + 1/x_k} < \frac{x_1 + x_2 + \dots + x_k}{k},$$

that is, the harmonic mean of $k$ positive numbers some of which are distinct is smaller than the respective arithmetic mean.
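The three inequalities of examples (a)-(c) are easy to observe on concrete numbers. The sketch below is illustrative only; the function name `means` and the sample values are assumptions, and all inputs must be positive.

```python
import math

def means(xs):
    """Harmonic, geometric, arithmetic means and root-mean-square of positive numbers."""
    k = len(xs)
    harmonic = k / sum(1 / x for x in xs)
    geometric = math.exp(sum(math.log(x) for x in xs) / k)
    arithmetic = sum(xs) / k
    rms = math.sqrt(sum(x * x for x in xs) / k)
    return harmonic, geometric, arithmetic, rms

h, g, m, r = means([1.0, 2.0, 4.0, 8.0])
print(h < m, g < m, m < r)   # each comparison corresponds to one example above
```

When all the numbers are equal, the four means coincide, which is why the theorems require that some of the numbers be distinct.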

Finally, we will prove a theorem that generalizes Theorems 2 and 3.

Theorem 4 Suppose that $y = f(x)$ is a function convex in an interval from $a$ to $b$ and $x_1, x_2, \ldots, x_k$ are $k$ values of the independent variable some of which are distinct and taken within the interval and $p_1, p_2, \ldots, p_k$ are $k$ positive numbers whose sum is equal to unity. Then

$$p_1 f(x_1) + p_2 f(x_2) + \dots + p_k f(x_k) < f(p_1 x_1 + p_2 x_2 + \dots + p_k x_k) \tag{7.4.6}$$

(the general case of Jensen's inequality).

For $k = 2$ Theorem 4 transforms into Theorem 2, while for $p_1 = p_2 = \dots = p_k = 1/k$ it transforms into Theorem 3.

Proof. Let us once more take the graph of a convex function $y = f(x)$ and a convex $k$-gon $M_1M_2 \ldots M_k$ inscribed into this graph. The vertices of the $k$-gon have the following coordinates: $(x_1, f(x_1)), (x_2, f(x_2)), \ldots, (x_k, f(x_k))$ (Figure 7.4.8). Suppose that $Q_2$ is a point of the side $M_1M_2$ of this $k$-gon such that

$$M_1Q_2 : Q_2M_2 = \frac{p_2}{p_1 + p_2} : \frac{p_1}{p_1 + p_2},$$

$Q_3$ is a point of the line segment $M_3Q_2$ such that

$$M_3Q_3 : Q_3Q_2 = \frac{p_1 + p_2}{p_1 + p_2 + p_3} : \frac{p_3}{p_1 + p_2 + p_3},$$

$Q_4$ is a point of the line segment $M_4Q_3$ such that

$$M_4Q_4 : Q_4Q_3 = \frac{p_1 + p_2 + p_3}{p_1 + p_2 + p_3 + p_4} : \frac{p_4}{p_1 + p_2 + p_3 + p_4}, \ \ldots;$$

finally, $Q_k = Q$ is a point of the segment $M_kQ_{k-1}$ such that $M_kQ : QQ_{k-1} = (p_1 + p_2 + \dots + p_{k-1}) : p_k$ (if $p_1 = p_2 = \dots = p_k = 1/k$, then $Q$ is the centroid of the $k$-gon $M_1M_2 \ldots M_k$). Using the proposition from which we started the proof of Theorem 2, we can find the coordinates of the points $Q_2, Q_3, Q_4, \ldots, Q$:

$$Q_2\left(\frac{p_1 x_1 + p_2 x_2}{p_1 + p_2},\ \frac{p_1 f(x_1) + p_2 f(x_2)}{p_1 + p_2}\right);$$

$$Q_3\left(\frac{p_1 + p_2}{p_1 + p_2 + p_3}\cdot\frac{p_1 x_1 + p_2 x_2}{p_1 + p_2} + \frac{p_3}{p_1 + p_2 + p_3}\,x_3,\ \frac{p_1 + p_2}{p_1 + p_2 + p_3}\cdot\frac{p_1 f(x_1) + p_2 f(x_2)}{p_1 + p_2} + \frac{p_3}{p_1 + p_2 + p_3}\,f(x_3)\right),$$

or

$$Q_3\left(\frac{p_1 x_1 + p_2 x_2 + p_3 x_3}{p_1 + p_2 + p_3},\ \frac{p_1 f(x_1) + p_2 f(x_2) + p_3 f(x_3)}{p_1 + p_2 + p_3}\right);$$

$$\ldots;$$

$$Q\left(\frac{p_1 x_1 + p_2 x_2 + \dots + p_{k-1} x_{k-1} + p_k x_k}{p_1 + p_2 + \dots + p_{k-1} + p_k},\ \frac{p_1 f(x_1) + p_2 f(x_2) + \dots + p_{k-1} f(x_{k-1}) + p_k f(x_k)}{p_1 + p_2 + \dots + p_{k-1} + p_k}\right),$$

or, in another form,

$$Q\left(p_1 x_1 + p_2 x_2 + \dots + p_k x_k,\ p_1 f(x_1) + p_2 f(x_2) + \dots + p_k f(x_k)\right),$$

since $p_1 + p_2 + \dots + p_k = 1$. Thus, in Figure 7.4.8,

$$OS = p_1 x_1 + p_2 x_2 + \dots + p_k x_k,$$
$$SQ = p_1 f(x_1) + p_2 f(x_2) + \dots + p_k f(x_k),$$
$$SP = f(p_1 x_1 + p_2 x_2 + \dots + p_k x_k).$$
Since point $Q$ lies below point $P$ (the entire $k$-gon $M_1M_2 \ldots M_k$ lies under the curve $y = f(x)$, and $Q$ is an interior point of this $k$-gon), we have

$$p_1 f(x_1) + p_2 f(x_2) + \dots + p_k f(x_k) < f(p_1 x_1 + p_2 x_2 + \dots + p_k x_k),$$

which is what we set out to prove.$^{7.14}$
Examples.

(a) $y = \ln x$. In this case we get

$$p_1\ln x_1 + p_2\ln x_2 + \dots + p_k\ln x_k < \ln(p_1 x_1 + p_2 x_2 + \dots + p_k x_k),$$

whence, taking antilogs, we have

$$x_1^{p_1} x_2^{p_2} \cdots x_k^{p_k} < p_1 x_1 + p_2 x_2 + \dots + p_k x_k,$$

where $p_1, p_2, \ldots, p_k > 0$ and $p_1 + p_2 + \dots + p_k = 1$. (This is the generalized theorem on geometric and arithmetic means.)

(b) $y = -x^2$. We have

$$-p_1 x_1^2 - p_2 x_2^2 - \dots - p_k x_k^2 < -(p_1 x_1 + p_2 x_2 + \dots + p_k x_k)^2,$$

or

$$\sqrt{p_1 x_1^2 + p_2 x_2^2 + \dots + p_k x_k^2} > p_1 x_1 + p_2 x_2 + \dots + p_k x_k,$$

where $p_1, p_2, \ldots, p_k > 0$ and $p_1 + p_2 + \dots + p_k = 1$. (This is the generalized theorem on the root-mean-square and the arithmetic mean.)

(c) $y = -1/x$. Here Theorem 4 yields

$$-\frac{p_1}{x_1} - \frac{p_2}{x_2} - \dots - \frac{p_k}{x_k} < -\frac{1}{p_1 x_1 + p_2 x_2 + \dots + p_k x_k},$$

whence we can easily obtain

$$\frac{1}{p_1/x_1 + p_2/x_2 + \dots + p_k/x_k} < p_1 x_1 + p_2 x_2 + \dots + p_k x_k,$$

where $p_1, p_2, \ldots, p_k > 0$ and $p_1 + p_2 + \dots + p_k = 1$.

7.14 The reader can easily see that the coordinates of any interior point of the $k$-gon $M_1M_2 \ldots M_k$ can be represented in the form $(p_1 x_1 + p_2 x_2 + \dots + p_k x_k,\ p_1 f(x_1) + p_2 f(x_2) + \dots + p_k f(x_k))$, where $p_1, p_2, \ldots, p_k$ are positive and $p_1 + p_2 + \dots + p_k = 1$. Thus, inequality (7.4.6) expresses the fact that a polygon inscribed in the graph of a convex function always lies below that graph.
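Inequality (7.4.6) can be probed numerically for any function that is convex in the sense used here (negative second derivative). The following is a hedged sketch; `jensen_gap` is a hypothetical helper, and the sample points and weights are arbitrary illustrative choices.

```python
import math

def jensen_gap(f, xs, ps):
    """f(sum p_i x_i) - sum p_i f(x_i); positive when f is convex in the text's sense."""
    if abs(sum(ps) - 1.0) > 1e-12 or min(ps) <= 0:
        raise ValueError("weights must be positive and sum to 1")
    weighted_x = sum(p * x for p, x in zip(ps, xs))
    return f(weighted_x) - sum(p * f(x) for p, x in zip(ps, xs))

xs = [1.0, 2.0, 5.0]
ps = [0.2, 0.3, 0.5]
print(jensen_gap(math.log, xs, ps))            # example (a)
print(jensen_gap(lambda x: -x * x, xs, ps))    # example (b)
print(jensen_gap(lambda x: -1 / x, xs, ps))    # example (c)
```

All three printed gaps should be positive; a gap of zero would arise only if all the $x_i$ coincided.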


Exercises 

7.4.1. Prove that the following functions are convex:

(a) $y = -x^n$ for $n > 1$ and $x > 0$, (b) $y = x^m$ for $0 < m < 1$ and $x > 0$, (c) $y = -1/x^k$ for $k > 0$ and $x > 0$, (d) $y = -x\log x$ for $x > 0$, (e) $y = -x\log x - (1 - x)\log(1 - x)$ for $0 < x < 1$ (the base in (d) and (e) is any number greater than unity).

7.4.2. Write out the inequalities (7.4.1), (7.4.4), (7.4.5), and (7.4.6), where the function $f(x)$ is (a) the function of Exercise 7.4.1a, (b) the function of Exercise 7.4.1b, (c) the function of Exercise 7.4.1c, (d) the function of Exercise 7.4.1d.

7.4.3. (a) Prove that the geometric mean of two (positive) numbers is the geometric mean of their arithmetic and harmonic means. (b) Derive inequality (7.4.3) using the result of (a) and inequality (7.4.2).


7.5 Computing Areas 

In Chapter 3 we showed that the value of a definite integral $\int_a^b f(x)\,dx$ yields the area of a figure bounded from above by the curve $y = f(x)$, from below by the $x$ axis, and on the sides by the vertical lines $x = a$ and $x = b$, or the vertical bases of the trapezoid (see Figure 7.5.1; for the sake of simplicity we assume that $f(x) > 0$ and $a < b$). Thus, being able to evaluate definite integrals enables us to use standard techniques in computing various areas, whereas elementary mathematics only allows for calculating the areas of rectilinear figures (polygons) and also of the circle and some of its parts (segment, sector).

Figure 7.5.2

Let us find the area of a figure bounded from above by the curve of the power function $y = cx^n$ ($c > 0$ and $n > 0$), from below by the $x$ axis, and on the right by the vertical line $x = x_0$ (we assume that $x_0 > 0$; in Figure 7.5.2, $n = 2$ and $c = 0.25$):

$$S = \int_0^{x_0} cx^n\,dx = \left.\frac{cx^{n+1}}{n + 1}\right|_0^{x_0} = \frac{cx_0^{n+1}}{n + 1}. \tag{7.5.1}$$

Let us rewrite formula (7.5.1) as

$$S = \frac{1}{n + 1}\,cx_0^n\,x_0,$$

or, since $cx_0^n = y(x_0) = y_0$,

$$S = \frac{1}{n + 1}\,y_0 x_0. \tag{7.5.2}$$

Since the quantities $y$ and $x$ have the dimensions of length, from (7.5.2) it follows that $S$ is indeed measured in units of area (the square of the unit of length). (We see that the area $S$ is, as to order of magnitude, $y_0 x_0$, and differs from this product solely in the factor $1/(n + 1)$, which, as to order of magnitude, is close to unity for $n$ not too large (compare with (5.6.6), where $y_0 = f_{\max}$, $x_0 = b - a$, and $1/(n + 1)$ is a dimensionless factor of the order of unity).)
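Formula (7.5.2) can be compared with a direct numerical integration. The sketch below uses a simple midpoint rule; the function names, the number of subintervals, and the sample values (taken from the caption data $n = 2$, $c = 0.25$, with an assumed $x_0 = 2$) are illustrative choices.

```python
def midpoint_integral(g, lo, hi, pieces=100_000):
    """Approximate the integral of g over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / pieces
    return h * sum(g(lo + (i + 0.5) * h) for i in range(pieces))

def power_area(c, exponent, x0):
    """Area under y = c * x**exponent on [0, x0] from (7.5.2): y0 * x0 / (n + 1)."""
    y0 = c * x0 ** exponent
    return y0 * x0 / (exponent + 1)

c, exponent, x0 = 0.25, 2, 2.0
print(power_area(c, exponent, x0))
print(midpoint_integral(lambda x: c * x ** exponent, 0.0, x0))
```

The two printed values should agree closely; here the factor $1/(n + 1)$ equals $1/3$ of the product $y_0 x_0 = 2$.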

In the next example, we find they 
are bounded from above by the exponen- 
tial curve 


y = ce“*/ a , e> 0, a > 0, 



from below by the x axis, and on the 
left and right by the straight lines 
x = x 0 and x = A (here A > x 0 ; Fig- 
ure 7.5.3). This area is 

A 

S A — j ce~ x / a dx= —cae~ x t a \x 0 

x 0 

= ca {e-*« ,a — e- Ala ). (7.5.4) 

If A is great compared with £ 0 , then 

e-xja e A/a t n w iH b e seen f rom 

(7.5.4) that increasing A hardly at 
all changes S A . As A increases without 
bound, the value of e A ! a approaches 
zero without bound. And so we can 
speak of the area of the figure in Fig- 
ure 7.5.3 as being unbounded on the 
right: the unlimited figure obtained 
from the figure in Figure 7.5.3 as 
A oo has a finite area 

oo 

Soo ~ j ce~ x f a dx — cae' x »l a — ay 0 , (7.5.5) 

3C 0 

where y 0 = y (# 0 ) = ce~ x ^ a . 

In formula (7.5.3), the exponent 
must be a dimensionless number, which 
means that the dimensions of a must 
coincide with those of x, that is, a 
has the dimensions of length. The 
dimensions of y and S are those of 
length and area respectively. 
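How quickly $S_A$ settles down to $ay_0$ can be seen from a few numbers. The following sketch simply evaluates formula (7.5.4) for growing A; the values of c, a, and x_0 and the name `area_to` are illustrative assumptions.

```python
import math

# Illustrative values (assumed): c = 2, a = 1.5, x0 = 0.5.
c, a, x0 = 2.0, 1.5, 0.5
y0 = c * math.exp(-x0 / a)

def area_to(A):
    """Formula (7.5.4): area under y = c*exp(-x/a) between x0 and A."""
    return c * a * (math.exp(-x0 / a) - math.exp(-A / a))

limit = a * y0   # formula (7.5.5)
for A in (2.0, 5.0, 10.0, 30.0):
    print(A, area_to(A), limit)
```

Each increase of A by the length a shrinks the missing part of the area by a factor of e, so a few multiples of a already exhaust almost all of the area.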

It turns out that the area under one arch of a sine curve (Figure 7.5.4) is expressed very simply. Indeed, this area, bounded from below by the line segment of the x axis from 0 to π, is

$$S = \int_0^{\pi} \sin x\,dx = -\cos x\Big|_0^{\pi} = 2. \tag{7.5.6}$$

Figure 7.5.4

It is not difficult to find the area under one arch of a cycloid (Figure 7.5.5; see also Section 1.8). If the curve is given parametrically, $x = \varphi(t)$ and $y = \psi(t)$ (see Section 1.8), the formula for the area of a curvilinear trapezoid assumes the form

$$S = \int_a^b y\,dx = \int_\alpha^\beta y(t)\,\frac{dx}{dt}\,dt,$$

where $a = x(\alpha)$ and $b = x(\beta)$ are the endpoints of the segment considered. For $x = R(t - \sin t)$ and $y = R(1 - \cos t)$, with $\alpha = 0$ and $\beta = 2\pi$, we have

$$\begin{aligned}
S &= \int_0^{2\pi} R(1 - \cos t)\,R(1 - \cos t)\,dt = R^2\int_0^{2\pi} (1 - \cos t)^2\,dt \\
&= R^2\int_0^{2\pi} (1 - 2\cos t + \cos^2 t)\,dt \\
&= R^2\int_0^{2\pi} dt - 2R^2\int_0^{2\pi} \cos t\,dt + R^2\int_0^{2\pi} \frac{1 + \cos 2t}{2}\,dt \\
&= R^2 t\,\Big|_0^{2\pi} - 2R^2\sin t\,\Big|_0^{2\pi} + \frac{R^2}{2}\Big(t + \frac{1}{2}\sin 2t\Big)\Big|_0^{2\pi} \\
&= R^2 \times 2\pi - 2R^2 \times 0 + \frac{R^2}{2} \times 2\pi + \frac{R^2}{4} \times 0 = 3\pi R^2,
\end{aligned} \tag{7.5.7}$$

which means that the area under one arch of a cycloid is three times the area of the circle that generates the cycloid.

Figure 7.5.5

The results (7.5.2) and (7.5.5)-(7.5.7) can also be formulated in a different manner. The area of the hatched curvilinear triangle OAM in Figure 7.5.2 is $(n+1)^{-1}\,OA \cdot AM = OA\,[(n+1)^{-1}AM]$, that is, coincides with the area of the rectangle with base OA and altitude $(n+1)^{-1}AM$. In view of this it is sometimes said that the effective altitude of our curvilinear triangle (or the curve $y = cx^n$ within the limits $x = 0$ to $x = x_0$) is one-(n + 1)th of the real altitude AM. Here the effective altitude is understood to be the altitude of the rectangle with the same base as that of the curvilinear triangle and the same area. Similarly, the effective altitudes of the curvilinear trapezoids depicted in Figures 7.5.4 and 7.5.5 with bases π and 2πR, respectively, are 2/π (≈ 0.6366) and 1.5R, while the real altitudes, or the values of $y_{\max}$, are 1 and 2R, respectively. Finally, the effective length of the infinite figure depicted in Figure 7.5.3, a figure bounded by the exponential curve $y = ce^{-x/a}$, the x axis, and the straight line $x = x_0$, is equal to a, since the area of this figure is that of the rectangle with base a and altitude $y_0$.

The concept of effective length (or effective altitude) of a curve often proves to be useful. For instance, it can be shown that the (infinitely long) bell-shaped curve in Figure 7.5.6, $y = c\exp(-x^2/a^2)$, bounds a finite area $S = ca\sqrt{\pi}$; in other words, the effective width of this curve, whose altitude is $y_{\max} = y(0) = c$, is equal to $a\sqrt{\pi} \approx 1.77a$ (see [15], Section 3.2).

Figure 7.5.6

Figure 7.5.7

Let us determine the area S of an ellipse with semiaxes a and b (Figure 7.5.7). Of course, this area can easily be found by methods of elementary geometry (see the text below printed in small type); however, we prefer using a more standard method (although in the given case a more cumbersome one)


involving integral calculus. Note that by virtue of symmetry it suffices to find the area $S_1$ of that portion which lies in the first quadrant and then multiply it by four: $S = 4S_1$. To compute $S_1$ we find y from the equation of the ellipse $x^2/a^2 + y^2/b^2 = 1$ (see Section 1.7): $y = (b/a)\sqrt{a^2 - x^2}$ (here the square root is understood to be the positive root, since in the first quadrant y is positive). Thus,

$$S_1 = \frac{b}{a}\int_0^a \sqrt{a^2 - x^2}\,dx. \tag{7.5.8}$$

The integral (7.5.8) can easily be found by making the change of variable $x = a\sin t$. This yields

$$\begin{aligned}
\int_0^a \sqrt{a^2 - x^2}\,dx &= \int_0^{\pi/2} a\sqrt{1 - \sin^2 t}\;a\cos t\,dt = \int_0^{\pi/2} a^2\cos^2 t\,dt \\
&= a^2\int_0^{\pi/2} \frac{1 + \cos 2t}{2}\,dt = \frac{a^2}{2}\Big(t + \frac{\sin 2t}{2}\Big)\Big|_0^{\pi/2} = \frac{\pi a^2}{4}.
\end{aligned} \tag{7.5.9}$$

Using this, from (7.5.8) we get

$$S_1 = \frac{b}{a} \cdot \frac{\pi a^2}{4} = \frac{\pi ab}{4}.$$

The area of the entire ellipse is $S = \pi ab$. If $a = b = r$, then we have $S = \pi r^2$ (the area of a circle) in complete accord with the fact that for $a = b = r$ an ellipse becomes a circle.
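The three areas found above, namely 2 for one arch of the sine curve (7.5.6), $3\pi R^2$ for one arch of the cycloid (7.5.7), and $\pi ab$ for the ellipse, can be checked by direct numerical integration. The sketch below is an illustration under our own assumptions: the helper `midpoint_integral` and the sample values of R, a, and b do not come from the text.

```python
import math

def midpoint_integral(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

R = 2.0                 # radius of the generating circle (assumed value)
a, b = 5.0, 2.0         # semiaxes of the ellipse (assumed values)

# One arch of the sine curve, formula (7.5.6).
sine_area = midpoint_integral(math.sin, 0.0, math.pi)

# One arch of the cycloid: integrate y(t) * dx/dt = R^2 * (1 - cos t)^2 over 0..2*pi.
cycloid_area = midpoint_integral(
    lambda t: (R * (1.0 - math.cos(t))) ** 2, 0.0, 2.0 * math.pi)

# Ellipse: four times the first-quadrant area (7.5.8).
ellipse_area = 4.0 * (b / a) * midpoint_integral(
    lambda x: math.sqrt(a * a - x * x), 0.0, a)

print(sine_area, 2.0)
print(cycloid_area, 3.0 * math.pi * R**2)
print(ellipse_area, math.pi * a * b)
```

The ellipse integrand has a vertical tangent at x = a, so its numerical estimate converges more slowly than the other two; this is one practical reason for preferring the exact substitution $x = a\sin t$.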


An ellipse with semiaxes a and b ($a > b$) is obtained from a circle of radius a by shrinking the latter to the x axis with a ratio of $k = b/a$ (see Figure 7.5.7). It is easy to see that such shrinking transforms a figure F of area s into a figure F' of area $s' = ks$. Indeed, a grid of small squares with side δ and area $\delta^2$ (the sides of the squares being parallel to the coordinate axes) is transformed by the shrinking transformation into a grid of rectangles with sides δ and kδ and area $k\delta^2$ each. But a grid of equal squares (a measuring grid) is used to measure areas: if a figure F is covered by N squares of the grid, its area s is approximately $N\delta^2$. On the other hand, the figure F', obtained through shrinking figure F in the above-described manner, will be covered by N rectangles of the transformed grid; therefore, $s' \approx Nk\delta^2$, whereby $s' = ks$ (since δ can be chosen as small as desired, so that the approximate equalities $s \approx N\delta^2$ and $s' \approx Nk\delta^2$ can be assumed as precise as desired). And since the area of a circle of radius a is equal to $\pi a^2$, the area of the ellipse with semiaxes a and b (the ellipse being obtained by shrinking the circle with a ratio $k = b/a$) is $k\pi a^2 = (b/a)\pi a^2 = \pi ab$.


Note an important circumstance. In Chapter 3 we already pointed out that the area (the integral) can be either positive or negative. This calls for a certain amount of care when finding areas. Suppose we want to know the amount of paint needed to paint an area bounded by two arches of a sine curve, from $x = 0$ to $x = 2\pi$, and the x axis (see Figure 3.5.2) if unit area requires a grams of paint. As was shown above, one integral cannot be used to compute the entire area. We have to take separate integrals over the intervals from $x = 0$ to $x = \pi$ and from $x = \pi$ to $x = 2\pi$.

Generally, if the integrand $y = f(x)$ changes sign, then to solve the problem in paint consumption, so to say, we must split the interval of integration into parts in which $f(x)$ preserves sign, then evaluate the integral over the separate parts, and finally sum the absolute values of the resulting integrals.
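The difference between the signed integral and the "painted" area is easy to see numerically for the two arches of the sine curve. In this sketch the helper `midpoint_integral` and the variable names are our own assumptions; the rule it applies (split at the sign change, then add absolute values) is the one just stated.

```python
import math

def midpoint_integral(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# A single integral over both arches: positive and negative parts cancel.
signed = midpoint_integral(math.sin, 0.0, 2.0 * math.pi)

# Split at x = pi, where sin x changes sign, and add absolute values.
painted = (abs(midpoint_integral(math.sin, 0.0, math.pi))
           + abs(midpoint_integral(math.sin, math.pi, 2.0 * math.pi)))

print(signed, painted)
```

The paint required is then a times `painted`, not a times `signed`.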


Exercises 

7.5.1. Find the area of a figure bounded by the x axis and a single arch of (a) $y = \sin^2 x$ and (b) $y = \cos^2 x$. [Hint. Draw the graphs of both functions and take advantage of the formulas $\sin^2 x = 1/2 - (1/2)\cos 2x$ and $\cos^2 x = 1 - \sin^2 x$.]

7.5.2. Find the area of a figure bounded from above by the curve $y = x(1 - x)$ and from below by the x axis.

7.5.3. Find the areas into which the parabola $y = (1/2)x^2$ divides the circle $x^2 + y^2 = 8$.

7.5.4. Find the amount of paint needed to cover the area of a figure bounded by (a) the curve $y = x/(1 + x^2)$, the x axis, and the vertical lines $x = 1$ and $x = -1$, and (b) the curve $y = x^3 + 2x^2 - x - 2$ and the x axis. [Hint. First construct the graph of the function $y = x^3 + 2x^2 - x - 2$.]

7.5.5. Find the area of the ellipse $x^2/25 + y^2/4 = 1$.


7.6* Estimating Sums and Products 

In Sections 3.1 and 3.2 we introduced 
the concept of a (definite) integral as 
the limit of a sequence of certain sums. 
By calculating these sums we can estim- 
ate the integral, or the area of the 
appropriate curvilinear trapezoid (see 
the method of rectangles and the trapezoid 
method described in Section 3.1). But 
the relationship that exists between 
integrals (or areas) and (finite) sums 
can be exploited in the opposite direc- 
tion, for estimating sums by means of 
integrals. This simple idea (see also 
[15], Section 1.2) will be illustrated 
by several instructive examples. 

We start with the sum of powers of natural numbers:

$$S_n^{(k)} = 1^k + 2^k + 3^k + \ldots + n^k, \tag{7.6.1}$$

where $k > -1$. Let us consider the function $y = x^k$; Figure 7.6.1 depicts the case with k positive, while the reader is advised to consider the case with $0 > k > -1$ as an exercise.$^{7.15}$ The area s of the curvilinear triangle $OA_nM_n$ depicted in Figure 7.6.1 is, obviously,

$$s = \int_0^n x^k\,dx = \frac{x^{k+1}}{k+1}\Big|_0^n = \frac{n^{k+1}}{k+1}. \tag{7.6.2}$$

Figure 7.6.1

On the other hand, the method of rectangles yields

$$1^k + 2^k + \ldots + n^k > s \quad\text{and}\quad 0^k + 1^k + \ldots + (n-1)^k < s,$$

that is,

$$1^k + 2^k + \ldots + n^k > \frac{n^{k+1}}{k+1} > 1^k + 2^k + \ldots + (n-1)^k,$$

or

$$\frac{n^{k+1}}{k+1} < S_n^{(k)} = 1^k + 2^k + \ldots + n^k < \frac{n^{k+1}}{k+1} + n^k. \tag{7.6.3}$$

This implies that for large values of n we have

$$S_n^{(k)} = 1^k + 2^k + \ldots + n^k \approx \frac{n^{k+1}}{k+1}, \tag{7.6.4}$$

in the sense that for $n \gg 1$ the ratio $S_n^{(k)} \big/ \dfrac{n^{k+1}}{k+1}$ differs but little from unity, since

$$1 < S_n^{(k)} \Big/ \frac{n^{k+1}}{k+1} < 1 + \frac{k+1}{n} \tag{7.6.3a}$$

(these inequalities are obtained if we divide all members in (7.6.3) by $n^{k+1}/(k+1)$).

7.15 For $k < 0$ we must consider not the curvilinear triangle $OA_nM_n$ but the curvilinear trapezoid $AA_nM_nM$ similar to the one depicted in Figure 7.6.2.
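The two-sided estimate (7.6.3) and the ratio bound (7.6.3a) can be watched at work for a concrete exponent. The sketch below is an illustration only: the function name `power_sum` and the choice k = 2.5 are our own assumptions (a positive k, as in Figure 7.6.1).

```python
def power_sum(n, k):
    """Direct evaluation of 1**k + 2**k + ... + n**k."""
    return sum(i**k for i in range(1, n + 1))

k = 2.5   # illustrative positive exponent (assumed)
for n in (10, 100, 1000):
    lower = n ** (k + 1) / (k + 1)      # area s from (7.6.2)
    upper = lower + n**k                # right-hand side of (7.6.3)
    s = power_sum(n, k)
    ratio = s / lower                   # should lie between 1 and 1 + (k+1)/n
    print(n, lower < s < upper, ratio, 1 + (k + 1) / n)
```

As n grows, the printed ratio approaches unity from above, which is exactly the content of (7.6.4).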


From (7.6.3) follows a more precise estimate of the sum $S_n^{(k)}$ than (7.6.4):

$$S_n^{(k)} = 1^k + 2^k + \ldots + n^k \approx \frac{1}{k+1}\,n^{k+1} + cn^k,$$

with $0 < c < 1$. Indeed, it can be proved that for $k > 0$ we have

$$S_n^{(k)} \approx \frac{1}{k+1}\,n^{k+1} + \frac{1}{2}\,n^k \tag{7.6.4a}$$

in the sense that for large values of n the ratio $\Big[S_n^{(k)} - \Big(\dfrac{1}{k+1}\,n^{k+1} + \dfrac{1}{2}\,n^k\Big)\Big] \Big/ n^k$ is very small.

The behavior of the sum

$$S_n = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} \tag{7.6.5}$$

for large values of n is quite different. The graph of the function $y = 1/x$, to which we naturally turn, is the hyperbola (Figure 7.6.2). The area of the curvilinear trapezoid $AA_nM_nM$ bounded by the x axis, the hyperbola $y = 1/x$, and the straight lines $x = 1$ and $x = n$, is given by the following formula:

$$\sigma = \int_1^n \frac{dx}{x} = \ln x\,\Big|_1^n = \ln n, \tag{7.6.6}$$

since $\ln 1 = 0$. On the other hand, by the method of rectangles (see Figure 7.6.2),

$$1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n-1} > \sigma > \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}, \tag{7.6.7}$$

that is (compare with (7.6.5)), $\sigma < S_n - 1/n$ and $\sigma > S_n - 1$, whence

$$\sigma + \frac{1}{n} = \ln n + \frac{1}{n} < S_n < \ln n + 1 = \sigma + 1. \tag{7.6.8}$$

Thus,

$$S_n = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} = \ln n + \gamma_n, \tag{7.6.9}$$

where the number $\gamma_n$ lies within the interval

$$\frac{1}{n} < \gamma_n < 1. \tag{7.6.9a}$$

Since, as it can easily be seen, $\gamma_n$ is the difference between the area of the square $AA_2M_2'M_1$ (equal to unity) and the sum of the areas hatched in Figure 7.6.2, it monotonically decreases as $n \to \infty$ and tends to a limit that can be estimated if we find (approximately) the hatched area in Figure 7.6.2 for a fixed value of n that is not too large; this limit, $\gamma \approx 0.577$, is known as Euler's constant. Thus, for $n \gg 1$, we have the following approximate formula:

$$S_n = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} \approx \ln n + \gamma \approx \ln n + 0.577. \tag{7.6.10}$$

The next example is the sum

$$T_n = \frac{\ln 1}{1} + \frac{\ln 2}{2} + \frac{\ln 3}{3} + \ldots + \frac{\ln n}{n}. \tag{7.6.11}$$

The experience we have gained in the process of studying the sums (7.6.1) and (7.6.5) tells us that in this case it is expedient to start by estimating the area τ of the curvilinear triangle bounded by the curve $y = (\ln x)/x$, the axis of abscissas, and the straight line $x = n$ (Figure 7.6.3); we compare τ with the sum $T_n$. The area τ is given by the formula

$$\tau = \int_1^n \frac{\ln x}{x}\,dx = \int_1^n \ln x\,d(\ln x) = \int_0^{\ln n} u\,du = \frac{u^2}{2}\Big|_0^{\ln n} = \frac{1}{2}(\ln n)^2 \tag{7.6.12}$$

(here we have introduced a new variable, $u = \ln x$).

Figure 7.6.3

Next we use a modified version of the method of rectangles. The area τ is replaced with the sum of areas of $n - 1$ rectangles with bases $A_1A_2, A_2A_3, \ldots, A_{n-1}A_n$ (see Figure 7.6.3), while the altitude of the rectangle with the base $A_{i-1}A_i$ is in one case the greatest value of the function $y = (\ln x)/x$ attained in the interval $i - 1 \le x \le i$ and in the other is the smallest value of the same function attained in the same interval. All this requires a detailed study of the curve $y = (\ln x)/x$. Obviously, $y(1) = 0$ and $y \to 0$ as $x \to \infty$ (cf. Section 6.5); let us find the maximum of the function $y(x)$. Since the equation

$$y'(x) = (\ln x)'\,\frac{1}{x} + \ln x\,\Big(\frac{1}{x}\Big)' = \frac{1}{x^2} - \frac{\ln x}{x^2} = \frac{1 - \ln x}{x^2} = 0$$

has a unique solution, $x = e$ ($\ln x = 1$ only at $x = e$), and $y'(x) > 0$ for $x < e$ and $y'(x) < 0$ for $x > e$, the graph of the function $y = (\ln x)/x$ has the shape depicted in Figure 7.6.3.$^{7.16}$ At $x = e$ the function has its only maximum ($y_{\max} = e^{-1} \approx 1/2.7$), from $x = 1$ to $x = e$ the function grows, and then decreases monotonically. Thus, the maximum of the function $y(x)$ on the interval from 2 to 3 is equal to $1/e$, while on all other intervals from $i - 1$ to i, with i an integer no less than two, the maximum coincides with the value of the function at one of the endpoints of the specified interval. We arrive at the following estimate for the area τ:

$$\tau < \frac{\ln 2}{2} + \frac{1}{e} + \frac{\ln 3}{3} + \frac{\ln 4}{4} + \ldots + \frac{\ln (n-1)}{n-1} \quad \Big({=}\; T_n + \frac{1}{e} - \frac{\ln n}{n}\Big),$$

and

$$\tau > \frac{\ln 1}{1} + \frac{\ln 2}{2} + \frac{\ln 4}{4} + \ldots + \frac{\ln n}{n} \quad \Big({=}\; T_n - \frac{\ln 3}{3}\Big),$$

that is,

$$\tau - \frac{1}{e} + \frac{\ln n}{n} < T_n < \tau + \frac{\ln 3}{3}.$$

Finally we get (see (7.6.12))

$$\frac{1}{2}(\ln n)^2 - \frac{1}{e} + \frac{\ln n}{n} < T_n < \frac{1}{2}(\ln n)^2 + \frac{\ln 3}{3}; \tag{7.6.13}$$

in other words, $T_n$ cannot differ considerably from $(1/2)(\ln n)^2$: for all values of n we have

$$T_n = \frac{1}{2}(\ln n)^2 + \delta_n,$$

where, in any case, $-0.368 \approx -e^{-1} < \delta_n < (\ln 3)/3 \approx 0.366$. Here, since the difference between τ and its lower bound determined by the method of rectangles, that is,

$$\tau - \Big(T_n - \frac{\ln 3}{3}\Big) \quad \Big({=}\; \frac{\ln 3}{3} - \delta_n\Big),$$

can only increase as n grows, $(\ln 3)/3 - \delta_n$ monotonically grows as $n \to \infty$, but always remains smaller than $(\ln 3)/3 + e^{-1} \approx 0.734$. From this it follows that $(\ln 3)/3 - \delta_n$ tends to a certain limit as $n \to \infty$, which means that $\delta_n$ also tends to a certain limit as $n \to \infty$ (the limit lies between −0.37 and +0.37).

7.16 The fact that the value $x = e$ at which $y'(x) = 0$ corresponds precisely to the maximum of $y = (\ln x)/x$ can easily be established without analyzing the sign of $y'(x)$ for different values of x, that is, simply by common-sense reasoning (compare with what was said at the end of Example 2 in Section 7.1).
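The estimates (7.6.10) and (7.6.13) can both be tested by summing directly. The sketch below is an illustration under our own assumptions (the function names `harmonic` and `log_ratio_sum` are not from the text); it prints $\gamma_n = S_n - \ln n$ and $\delta_n = T_n - (1/2)(\ln n)^2$ for several n.

```python
import math

def harmonic(n):
    """S_n = 1 + 1/2 + ... + 1/n, formula (7.6.5)."""
    return sum(1.0 / i for i in range(1, n + 1))

def log_ratio_sum(n):
    """T_n = ln(1)/1 + ln(2)/2 + ... + ln(n)/n."""
    return sum(math.log(i) / i for i in range(1, n + 1))

for n in (10, 1000, 100_000):
    gamma_n = harmonic(n) - math.log(n)                    # see (7.6.9)
    delta_n = log_ratio_sum(n) - 0.5 * math.log(n) ** 2    # see (7.6.13)
    print(n, gamma_n, delta_n)
```

Both columns settle down as n grows: the first toward Euler's constant, the second toward a limit lying well inside the interval from −0.37 to +0.37 found above.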

This method can be applied for estimating the value of $n! = 1 \cdot 2 \cdot 3 \ldots n$. We write the number $\ln (n!) = \ln (1 \cdot 2 \cdot 3 \ldots n)$ as

$$\ln (n!) = \ln 1 + \ln 2 + \ln 3 + \ldots + \ln n. \tag{7.6.14}$$

Figure 7.6.4

To estimate the sum on the right-hand side of (7.6.14), we consider the curvilinear triangle $A_1A_nM_n$ bounded by the x axis, the curve $y = \ln x$, and the straight line $x = n$ (n is a natural number greater than unity) (Figure 7.6.4). The area $\sigma_n$ of this triangle is, obviously,

$$\begin{aligned}
\sigma_n &= \int_1^n \ln x\,dx = x\ln x\,\Big|_1^n - \int_1^n x\,d(\ln x) = x\ln x\,\Big|_1^n - \int_1^n x\,\frac{dx}{x} \\
&= x\ln x\,\Big|_1^n - \int_1^n dx = (x\ln x - x)\,\Big|_1^n = n\ln n - n + 1
\end{aligned}$$

(see Section 5.4). If we inscribe in our curvilinear triangle $A_1A_nM_n$ the right triangle $A_1A_2M_2$ and $n - 2$ trapezoids $A_2A_3M_3M_2$, $A_3A_4M_4M_3$, ..., $A_{n-1}A_nM_nM_{n-1}$ (here $A_1, A_2, \ldots, A_n$ are points on the x axis with abscissas 1, 2, ..., n, and $M_2, \ldots, M_n$ are the corresponding points on the curve $y = \ln x$), we will find that $\sigma_n$ is greater than the sum of the areas of the right triangle and the trapezoids, the sum being equal to

$$\begin{aligned}
s_n &= \frac{1}{2}\ln 2 + \frac{1}{2}(\ln 2 + \ln 3) + \frac{1}{2}(\ln 3 + \ln 4) + \ldots + \frac{1}{2}[\ln (n-1) + \ln n] \\
&= \ln 2 + \ln 3 + \ldots + \ln (n-1) + \frac{1}{2}\ln n \\
&= (\ln 1 + \ln 2 + \ln 3 + \ldots + \ln n) - \frac{1}{2}\ln n = \ln (n!) - \frac{1}{2}\ln n.
\end{aligned} \tag{7.6.15a}$$

On the other hand, the area $\sigma_n$ of the curvilinear triangle is smaller than the sum $S_n$ of the areas of $n - 2$ trapezoids with midlines $A_iM_i$, $i = 2, 3, \ldots, n - 1$, and altitudes of unit length (the altitudes are the line segments with endpoints $(i - 1/2, 0)$ and $(i + 1/2, 0)$, with i running through the same values; the upper lateral side is a segment of the line tangent to the curve $y = \ln x$ at point $M_i$) plus the area of the small trapezoid with the altitude $A_1A_{3/2}$ of length 1/2 and the midline $A_{5/4}M_{5/4}$ (here $A_{5/4} = (5/4, 0)$ and $M_{5/4}$ is the point on the curve $y = \ln x$ corresponding to $A_{5/4}$) plus the area of the rectangle $A_{n-1/2}A_nM_nM_{n-1/2}$ (here $A_{n-1/2} = (n - 1/2, 0)$ and $M_{n-1/2} = (n - 1/2, \ln n)$). It is clear that

$$\begin{aligned}
S_n &= \frac{1}{2}\ln \frac{5}{4} + 1 \cdot \ln 2 + 1 \cdot \ln 3 + \ldots + 1 \cdot \ln (n-1) + \frac{1}{2}\ln n \\
&= \ln 2 + \ln 3 + \ldots + \ln (n-1) + \ln n + \frac{1}{2}\ln \frac{5}{4} - \frac{1}{2}\ln n \\
&= \ln (n!) - \frac{1}{2}\ln n + \frac{1}{2}\ln \frac{5}{4}.
\end{aligned} \tag{7.6.15b}$$

Since $s_n < \sigma_n < S_n$, we can write

$$\ln (n!) - \frac{1}{2}\ln n < n\ln n - n + 1 < \ln (n!) - \frac{1}{2}\ln n + \frac{1}{2}\ln \frac{5}{4},$$

or

$$\ln (n!) > n\ln n + \frac{1}{2}\ln n - n + 1 - \frac{1}{2}\ln \frac{5}{4} \quad\text{and}\quad \ln (n!) < n\ln n + \frac{1}{2}\ln n - n + 1,$$

that is,

$$\Big(1 - \ln \sqrt{\tfrac{5}{4}}\Big) + \ln \big(n^n\sqrt{n}\,e^{-n}\big) < \ln (n!) < 1 + \ln \big(n^n\sqrt{n}\,e^{-n}\big).$$

In other words,

$$\sqrt{\tfrac{4}{5}}\;e\,n^n\sqrt{n}\,e^{-n} < n! < e\,n^n\sqrt{n}\,e^{-n}. \tag{7.6.16}$$

This formula shows that for large n the value of n! is close to $C\sqrt{n}\,n^ne^{-n}$, where the number C lies between $e \approx 2.72$ and $\sqrt{4/5}\,e \approx 2.43$. Here, from the fact that the difference $\sigma_n - s_n$ increases monotonically as $n \to \infty$, we can easily derive, as we did earlier, that $n!\big/\big(\sqrt{n}\,n^ne^{-n}\big)$ tends to a certain limit as $n \to \infty$ (this limit lies between $\sqrt{4/5}\,e$ and e, that is, between 2.43 and 2.72). It can be proved that this limit is equal to $\sqrt{2\pi} \approx 2.507$. Thus, we arrive at Stirling's formula$^{7.17}$

$$n! \approx \sqrt{2\pi n}\;n^ne^{-n}, \tag{7.6.17}$$

in which the approximate equality means that, as $n \to \infty$, the ratio $n!\big/\big(\sqrt{2\pi n}\,n^ne^{-n}\big)$ tends to unity (and for n large it is very close to unity).

Figure 7.6.5
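The bounds (7.6.16) and the limiting value $\sqrt{2\pi}$ can be examined numerically. The following sketch is an illustration under our own assumptions: the name `stirling_ratio` is hypothetical, and we use the standard-library function `math.lgamma` (which returns $\ln\Gamma(x)$, so that `math.lgamma(n + 1)` equals $\ln (n!)$) to avoid overflow for large n.

```python
import math

def stirling_ratio(n):
    """n! divided by sqrt(n) * n**n * exp(-n), computed through logarithms."""
    log_ratio = math.lgamma(n + 1) - (0.5 * math.log(n) + n * math.log(n) - n)
    return math.exp(log_ratio)

lower = math.sqrt(4 / 5) * math.e    # left-hand constant in (7.6.16)
upper = math.e                       # right-hand constant in (7.6.16)

for n in (2, 10, 100, 10_000):
    print(n, lower, stirling_ratio(n), upper, math.sqrt(2 * math.pi))
```

The ratio stays between the two constants for every n greater than unity and approaches $\sqrt{2\pi}$ quite rapidly.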

7.17 James Stirling (1692-1770), a Scottish mathematician.

Stirling's formula (7.6.17) can be substantiated in another manner. Let us evaluate the integral

$$I_n = \int_0^\infty x^ne^{-x}\,dx, \tag{7.6.18}$$

where n is a positive integer. The integral $I_n$ can be thought of as the area of an (unlimited) figure bounded from above by the curve $y = x^ne^{-x}$ and from below by the x axis for $x > 0$ (Figure 7.6.5; $n = 4$).$^{7.18}$ To evaluate the integral, we use integration by parts setting $e^{-x}\,dx = dg$ and $x^n = f$, which means that $g = -e^{-x}$ and $df = nx^{n-1}\,dx$. Thus, we obtain

$$\int_0^\infty x^ne^{-x}\,dx = (-x^ne^{-x})\,\Big|_0^\infty + \int_0^\infty nx^{n-1}e^{-x}\,dx.$$

In Section 6.5 it was established that $x^ne^{-x} = x^n/e^x \to 0$ as $x \to \infty$. Since $x^ne^{-x} = 0$ for $x = 0$, it follows that $-x^ne^{-x}\,\big|_0^\infty = 0$; hence,

$$\int_0^\infty x^ne^{-x}\,dx = n\int_0^\infty x^{n-1}e^{-x}\,dx, \quad\text{i.e.}\quad I_n = nI_{n-1}.$$

Applying integration by parts to $I_{n-1}$, we likewise get $I_{n-1} = (n - 1)I_{n-2}$, and so forth. For this reason

$$I_n = n(n-1)(n-2) \ldots 3 \cdot 2 \cdot 1 \cdot I_0,$$

where $I_0 = \int_0^\infty e^{-x}\,dx = -e^{-x}\,\big|_0^\infty = -0 + 1 = 1$. Thus,

$$I_n = \int_0^\infty x^ne^{-x}\,dx = n(n-1)(n-2) \ldots 3 \cdot 2 \cdot 1 = n!. \tag{7.6.19}$$

For large values of n we can arrive at a good approximation for $I_n$ if we find the value $x = x_{\max}$ at which the integrand $x^ne^{-x}$ attains its maximum value. This value is found from the condition

$$(x^ne^{-x})' = nx^{n-1}e^{-x} + x^n(-e^{-x}) = (n - x)\,x^{n-1}e^{-x} = 0,$$

whence $x_{\max} = n$. The magnitude of this maximum, obviously, is $y_{\max} = (x^ne^{-x})_{\max} = n^ne^{-n}$. If we write our function in the form $y = e^{f(x)}$, where $f(x) = n\ln x - x$, and expand $f(x)$ in a Taylor's series (6.1.18) near the value $x = x_{\max}$ ($= n$), we can estimate the effective width of our (unlimited) figure, that is, the width of a rectangle whose height is $y_{\max}$ and whose area is that of the figure considered, or $I_n$. This width proves to be approximately equal to $\sqrt{2\pi n}$ (see [15], Sections 3.2 and 3.3). Hence, we arrive at Stirling's formula: $n! \approx \sqrt{2\pi n}\;n^ne^{-n}$ for $n \gg 1$.

7.18 Note that for any (arbitrarily large) value of n the quantity $y = x^ne^{-x}$ tends to zero as $x \to \infty$ (see Section 6.5; this fact will be used later); if this was not so, we could not speak of an area equal to $I_n$.
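Formula (7.6.19) can be confirmed numerically for small n. The sketch below is an illustration under stated assumptions: the function name `gamma_integral` is hypothetical, and the infinite upper limit is replaced by the finite value 60, beyond which the integrand is negligible for the values of n used here.

```python
import math

def gamma_integral(n, upper=60.0, steps=60_000):
    """Midpoint-rule estimate of the integral of x**n * exp(-x) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**n * math.exp(-x)
    return h * total

for n in (1, 2, 4, 6):
    y_max = n**n * math.exp(-n)                 # height of the integrand at x = n
    width = gamma_integral(n) / y_max           # effective width of the figure
    print(n, gamma_integral(n), math.factorial(n), width, math.sqrt(2 * math.pi * n))
```

The second and third columns coincide to within the accuracy of the integration, while the last two columns show the effective width drawing closer to $\sqrt{2\pi n}$ in relative terms as n increases.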
7.7* More on Natural Logarithms

Let us examine the problem of natural logarithms. In Chapter 3, the integration of power functions led us to the following results:

$$\int x^n\,dx = \frac{1}{n+1}\,x^{n+1} + C \quad\text{for } n \ne -1, \tag{7.7.1}$$

$$\int \frac{dx}{x} = \ln x + C, \tag{7.7.2}$$

where, of course, the constant C may be any number and not the same in Eqs. (7.7.1) and (7.7.2).

But why does the case (7.7.2) differ so strongly from (7.7.1)? What is the reason for this? At first glance there is a contradiction here. Imagine a sequence of power functions, or curves, $y = x^{-0.9}$, $y = x^{-0.99}$, $y = x^{-1}$, $y = x^{-1.01}$, and $y = x^{-1.1}$. All these curves lie close to each other, and the curve with $n = -1$ in no way differs qualitatively from the adjacent curves, for instance, from the curves with $n = -0.99$ and $n = -1.01$. Hence, the area under the curves $y = x^n$ (within fixed limits $a < x < b$) must be a smooth function of n, that is, it must change smoothly as we pass the value $n = -1$. But this area is expressed by a definite integral, whose form at $n = -1$ is quite different from that at $n = -0.99$ and $n = -1.01$.

To resolve this seeming contradiction, we must show that for n close to −1 the integral in (7.7.1) is close to the logarithmic expression in (7.7.2). Let us carry out the following calculations:

$$S = \int_a^b \frac{dx}{x} = \ln \frac{b}{a} \quad\text{for } n = -1 \text{ and } a, b > 0; \tag{7.7.3}$$

$$S = \int_a^b x^{-1+\varepsilon}\,dx = \frac{1}{\varepsilon}\,(b^\varepsilon - a^\varepsilon) \quad\text{for } n = -1 + \varepsilon. \tag{7.7.4}$$

If the exponent n is close to −1, then ε is small and the expression (7.7.4) for S assumes a form that is inconvenient for calculations. Any number raised to the zeroth power is equal to unity, which means that if we substitute $\varepsilon = 0$ into (7.7.4), we will have $b^\varepsilon - a^\varepsilon = 1 - 1 = 0$. Thus, at $\varepsilon = 0$, both the numerator and the denominator in (7.7.4) vanish.

In this lies the answer to a question often posed by students that compare the indefinite integrals (7.7.1) and (7.7.2): the right-hand side of (7.7.1) contains $n + 1$ in the denominator, and at first glance it seems to tend to infinity as $n \to -1$, while the logarithm is finite. But this infinity in (7.7.1) is fictitious, since it disappears when we pass to a definite integral and, hence, it can be eliminated by appropriately selecting the constant of integration C (e.g. $C = -(n + 1)^{-1} = -1/\varepsilon$).

How does one go about simplifying the expression for an area that incorporates $a^\varepsilon$ and $b^\varepsilon$? How should we calculate powers with small exponents? Let us write the following identities:

$$a = e^{\ln a}, \quad a^\varepsilon = e^{\varepsilon\ln a}.$$

We use the main property of the number e, that is, that for small $r \ll 1$ we have $e^r \approx 1 + r$ (to within terms of the second and higher orders of smallness, that is, quantities of the order of $r^2$, $r^3$, etc.). If n is close to −1, that is, $n = -1 + \varepsilon$, where the term ε is extremely small, we can apply the formulas we have just written:

$$S = \frac{1}{\varepsilon}\,(b^\varepsilon - a^\varepsilon) = \frac{1}{\varepsilon}\,(e^{\varepsilon\ln b} - e^{\varepsilon\ln a}) \approx \frac{1}{\varepsilon}\,[(1 + \varepsilon\ln b) - (1 + \varepsilon\ln a)] = \ln b - \ln a = \ln \frac{b}{a}. \tag{7.7.5}$$

This resolves the contradiction. In the limit of small ε the quantity ε has canceled out, and we have found that formula (7.7.1) leads to the same result as formula (7.7.2). The area under the curve $y = x^n$ proves to be a smooth function of the exponent n even when n passes through the "singular" value $n = -1$.

We now know how to calculate small powers $x^\varepsilon$ for $\varepsilon \ll 1$: $x^\varepsilon \approx 1 + \varepsilon\ln x$. From school mathematics we know that $x^0 = 1$. Now we know how much $x^\varepsilon$ differs from unity if ε is close to zero but still differs somewhat from zero.
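The passage to the limit in (7.7.5) can be followed numerically. The sketch below is an illustration only; the limits a = 2 and b = 7 and the variable names are our own assumptions. It evaluates the area (7.7.4) for several small ε and compares a small power with its linear approximation.

```python
import math

a, b = 2.0, 7.0                 # illustrative limits of integration (assumed)
exact = math.log(b / a)         # formula (7.7.3), the case n = -1

for eps in (0.1, 0.01, 0.001, -0.001):
    area = (b**eps - a**eps) / eps            # formula (7.7.4), n = -1 + eps
    small_power = b**eps                      # exact small power
    linear = 1.0 + eps * math.log(b)          # approximation e**r ~ 1 + r
    print(eps, area, exact, small_power, linear)
```

The area changes smoothly through ε = 0 (negative ε corresponds to n slightly below −1), and the error of the linear approximation falls off roughly as ε squared.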

We can approach the integral $\int x^{-1}\,dx$ from still another angle. Imagine a person that knows nothing about logarithms. However, we assume that he or she knows what integral calculus is and, confronted with an integral, would like to study its properties. Thus, let us consider the integral

$$J(a, b) = \int_a^b \frac{dx}{x}. \tag{7.7.6}$$

A remarkable property of integral (7.7.6) is that the value of $J(a, b)$ does not change if a and b are multiplied by a quantity k (the same for a and b). As for the area under the curve $y = x^{-1} = 1/x$, we can say that if we replace the limits of integration a and b in (7.7.6) with ka and kb the length of the strip whose area we are calculating will increase k-fold, but simultaneously its height will decrease k-fold (here we assume that $k > 1$; see Figure 7.7.1). The area of the strip does not change in the process.

Figure 7.7.1

This result can easily be obtained without applying geometrical methods. Let us introduce the variable $z = kx$, $x = z/k$, $dx = (1/k)\,dz$. For the new variable the limits of integration will change, too; namely, $x = a$ becomes $z = ka$ and $x = b$ becomes $z = kb$. We have

$$J(a, b) = \int_a^b \frac{dx}{x} = \int_{ka}^{kb} \frac{(1/k)\,dz}{z/k} = \int_{ka}^{kb} \frac{dz}{z} = \int_{ka}^{kb} \frac{dx}{x} = J(ka, kb).$$

(At the end of the chain of formulas we employed the fact that the variable of integration is a dummy variable and can be denoted by any letter.)

If the function of two variables $J(a, b)$ does not change when we multiply both a and b by a constant k, then this function depends only on the ratio of the two variables:

$$J(a, b) = F(t), \quad\text{where } t = b/a. \tag{7.7.7}$$

Indeed, suppose that $J(1, t) = F(t)$. Then, since $J(a, b) = J(ka, kb)$ for all values of k, we put $k = 1/a$ and obtain

$$J(a, b) = J(1, b/a) = J(1, t) = F(t).$$
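The person "who knows nothing about logarithms" can observe the invariance $J(a, b) = J(ka, kb)$ experimentally, using nothing but sums. The sketch below is an illustration under our own assumptions: the function `J` is a plain midpoint-rule estimate of (7.7.6), and the sample limits and factors k are arbitrary.

```python
def J(a, b, steps=20_000):
    """Midpoint-rule estimate of the integral of dx/x from a to b (a, b > 0)."""
    h = (b - a) / steps
    return h * sum(1.0 / (a + (i + 0.5) * h) for i in range(steps))

a, b = 1.5, 4.0   # illustrative limits (assumed)
for k in (0.1, 1.0, 3.0, 50.0):
    print(k, J(k * a, k * b))

# The same value is obtained from the ratio t = b/a alone, as in (7.7.7).
print(J(1.0, b / a))
```

Every line prints practically the same number: each strip of the sum becomes k times wider and k times lower, exactly as in the geometric argument.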

But we know that J is an integral. Using the general properties of an integral, we obtain an important property of $F(t)$. Let us set $b = ac^m$, where m is an integer, and split the domain of integration into m parts. Then $J(a, b)$ is equal to a sum of integrals:

$$J(a, b) = J(a, ac) + J(ac, ac^2) + J(ac^2, ac^3) + \ldots + J(ac^{m-1}, b). \tag{7.7.8}$$

But all the integrals on the right-hand side are the same, since for every one of them the ratio of the upper limit of integration to the lower limit is c (we recall that $b = ac^m$). Thus,

$$J(a, b) = J(a, ac^m) = F(b/a) = F(c^m), \quad J(a, ac) = J(ac, ac^2) = \ldots = F(c),$$

and, hence,

$$F(c^m) = mF(c). \tag{7.7.9}$$

We see that the function $F(t)$, which is defined as the integral $\int_1^t dx/x$ with the lower limit of integration equal to unity, possesses the following remarkable property: when the independent variable is raised to the mth power (i.e. $c \to c^m$), the function is multiplied by m. But this is the main property of the logarithm! If we denote the ratio $F(d)/F(c)$ (where c is fixed and d can vary) by $f(d)$, then

$$f(c^m) = m,$$

that is, if $d = c^m$, then $f(d) = m$; in other words, $f(d)$ is the power to which we must raise the constant fixed number c (the base) to obtain the number d. This means simply that $f(d)$ is the logarithm of number d (the reader will recall the common definition of a logarithm):$^{7.19}$

$$f(d) = \log_c d.$$

7.19 See A. I. Markushevich, Areas and Logarithms, Mir Publishers, Moscow, 1981.

Is there any need to consider the simple formula $\int x^{-1}\,dx = \ln x + C$ in such detail? For many calculations this is unnecessary, and in the majority of textbooks this is not done. Of course, it is nice to know how to deal with the problem when the exponent in the integral is close to minus unity but is not exactly minus unity and what the error introduced is when we substitute −1 for $-1 + \varepsilon$; the first part of this section was devoted to these aspects. But besides the purely technical aspects, there is an aspect of principal importance here (and the remaining part of the section is devoted to this), namely, that a function (in the case at hand, the logarithmic function) can be defined by means of an integral!

In school we often encounter functions that are specified by means of algorithms, or rules for calculating their values. What does the notation $f(x) = x^3$ mean? Here we have the rule: multiply x by x and then by x once more, and you get $f(x)$. Starting with this rule, we can study the properties of function $f(x)$; for instance, it is clear that if x is increased by a factor of two, the values of the function increase by a factor of eight. But a function may be specified by an integral, and its properties can be studied even if we do not know how to find its values directly (i.e. not by integration but by means of various algebraic transformations).$^{7.20}$ For instance, the function

$$\int_0 \ldots \tag{7.7.10}$$

plays an important role in the theory of probability, while the function

$$t = \text{constant} \times \int \frac{dx}{\sqrt{1 - x^2}} \tag{7.7.11}$$

in the theory of vibrations specifies the dependence of time on the position of a vibrating particle. Luckily, the integral on the right-hand side of (7.7.11) can be evaluated in finite form, and this dependence can be reversed, that is, we can pass from the function $t = t(x)$ to the function $x = x(t)$. The result is $x = x_0\sin \omega t$ (with $\omega = 1/k$). However, in more complicated cases in the theory of vibrations, we arrive at the integral

$$t = \int \ldots \tag{7.7.12}$$

This integral cannot be evaluated directly, that is, it cannot be expressed in terms of simple functions, but its properties can be studied and tables of its values can be compiled.

Such a generalization of the concept of a function and a study of the properties of the function specified by the integral $\int x^{-1}\,dx$ require of the reader the involved line of reasoning given above.

7.20 Note that at the beginning of the 20th century the German mathematician F. Klein suggested introducing in school mathematics the logarithm by means of the formula $\int_1^a x^{-1}\,dx = \ln a$.
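Property (7.7.9) can be observed without any prior knowledge of logarithms, by treating $F(t)$ purely as a numerically computed integral. The sketch below is an illustration under our own assumptions: the function `F` is a midpoint-rule estimate of $\int_1^t dx/x$, and the base c = 1.7 is arbitrary.

```python
def F(t, steps=50_000):
    """Midpoint-rule estimate of the integral of dx/x from 1 to t (t > 0)."""
    h = (t - 1.0) / steps
    return h * sum(1.0 / (1.0 + (i + 0.5) * h) for i in range(steps))

c = 1.7   # illustrative fixed base (assumed)
for m in (1, 2, 3, 5):
    # F(c**m) / F(c) plays the role of f(d) with d = c**m.
    print(m, F(c**m), m * F(c), F(c**m) / F(c))
```

The last column reproduces the exponent m, that is, the ratio $F(d)/F(c)$ behaves exactly as the logarithm of d to the base c.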

Exercise 

7.7.1. Derive from the definition (7.7.6) and (7.7.7) of $F(t)$ the main property of this function: $F(ab) = F(a) + F(b)$ for all positive a and b.

7.8 Average, or Mean, Values 

The concept of an integral can be used to give an exact definition of the mean, or average, of a quantity that is a function of another quantity.

If we have a quantity that assumes a set of several values, say the m values $v_1, v_2, v_3, \ldots, v_m$, then the natural way of defining the mean $\bar{v}$ of this quantity is to write

$$\bar{v} = \frac{v_1 + v_2 + v_3 + \ldots + v_m}{m}.$$

But how should we define the mean value of a function $v(t)$ of a variable t that assumes all values in the interval from a to b ($a < t < b$)?

Let us assume that $v(t)$ is the instantaneous velocity. How should we define the value $\bar{v}(a, b)$, that is, the mean velocity over the time interval from a to b? The average, or mean, velocity is defined as the ratio of the distance traveled to the time that has elapsed:

$$\bar{v}(a, b) = \frac{\int_a^b v(t)\,dt}{b - a}.$$

This definition of the mean is meaningful when the function is not only the rate of motion but any other function. For instance, suppose that $y = y(x)$ is the graph of a function in the xy-plane depicted in Figure 7.8.1. Then $\int_a^b y(x)\,dx$ is the area under the curve. The formula

$$\bar{y}(a, b) = \frac{\int_a^b y(x)\,dx}{b - a}, \quad\text{or}\quad (b - a)\,\bar{y} = \int_a^b y(x)\,dx, \tag{7.8.1a}$$

means that $\bar{y}$ is the height of a rectangle with base $b - a$ whose area is equal to the area under the curve.$^{7.21}$ This means that the hatched area with the sign "+" above the straight line $y = \bar{y}$ (Figure 7.8.1) is exactly equal to the hatched area with the sign "−" under the straight line $y = \bar{y}$. The graph of the function $y = y(x)$ (if it is not a straight line parallel to the x axis) must lie partly above the line $y = \bar{y}$ and partly below, where $\bar{y}$ is the mean value determined by (7.8.1a). Hence $\bar{y}$ is greater than the smallest value of $y(x)$ and smaller than the greatest value of $y(x)$ on the interval $a < x < b$.

Figure 7.8.1
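Definition (7.8.1a) translates directly into a computation: the mean of a function is the limit of the ordinary arithmetic mean of its values at many equally spaced points. The sketch below is an illustration under our own assumptions (the name `mean_value` and the sample functions are not from the text).

```python
import math

def mean_value(f, a, b, steps=20_000):
    """Mean of f over [a, b] in the sense of (7.8.1a), by the midpoint rule.

    The midpoint-rule integral divided by (b - a) is simply the arithmetic
    mean of f at the midpoints of `steps` equal subintervals.
    """
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) / steps

# Mean ordinate of one arch of the sine curve; Section 7.5 gives 2/pi.
print(mean_value(math.sin, 0.0, math.pi), 2.0 / math.pi)

# Mean of y = x**2 on the interval from 0 to 3.
print(mean_value(lambda x: x * x, 0.0, 3.0))
```

In both cases the mean lies strictly between the smallest and the greatest values of the function on the interval.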

Let us consider the following exam- 
ples. 

7.21 In Section 7.5 we called the quantity $\bar y$ (for $y(x) > 0$ on the interval $a < x < b$) the effective altitude, or height, of the curvilinear trapezoid with area $S = \int_a^b y\,dx$.
[Figure 7.8.2]

Suppose that $y(x)$ is the linear function $y = kx + m$. Then the integral $I(a, b)$, which is the area of the trapezoid in Figure 7.8.2 with altitude $b - a$, bases $y(a)$ and $y(b)$, and midline $y\left(\frac{a+b}{2}\right)$, is

$$I(a, b) = \frac{y(a) + y(b)}{2}\,(b - a) = y\left(\frac{a+b}{2}\right)(b - a). \qquad (7.8.2)$$

This formula can also easily be obtained without resorting to geometrical methods:

$$I(a, b) = \int_a^b (kx + m)\,dx = \left.\left(\frac{kx^2}{2} + mx\right)\right|_a^b = \left(\frac{kb^2}{2} + mb\right) - \left(\frac{ka^2}{2} + ma\right),$$

with $y(b) = kb + m$, $y(a) = ka + m$, and $y\left(\frac{a+b}{2}\right) = k(a + b)/2 + m$, from which (7.8.2) readily follows. Thus, for the linear function,

$$\bar y = \frac{y(a) + y(b)}{2} = y\left(\frac{a+b}{2}\right), \qquad (7.8.2a)$$

that is, for the linear function the mean value over a given interval is the arithmetic mean of the values $y(a)$ and $y(b)$ at the endpoints of the interval. Here is another way of stating this fact: the mean value of the linear function over an interval is equal to the value of the function at the midpoint of the interval.

[Figure 7.8.3]

An important example of the linear function is the time dependence of the velocity in uniformly accelerated or decelerated motion, that is, when a body moves under a constant force, say, the force of gravity, when $v = gt + v_0$. In calculating the distance traveled we use the properties of the mean of the linear function:

$$z(a, b) = (b - a)\left[\frac{v(a) + v(b)}{2}\right] = (b - a)\left(g\,\frac{a + b}{2} + v_0\right).$$

The reader must bear in mind, however, that for any other, that is, nonlinear, dependence, the formula (7.8.2a) becomes invalid.
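The statement (7.8.2a) and the warning about nonlinear functions can be checked numerically. The following is only a sketch: the helper `mean_value`, the midpoint-rule integration, and the sample values of `g`, `v0`, `a`, and `b` are illustrative assumptions, not part of the text.

```python
def mean_value(f, a, b, n=10_000):
    """Midpoint-rule estimate of (1/(b-a)) * integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)

# Linear case: v = g*t + v0 (g and v0 are illustrative values).
g, v0 = 9.8, 2.0

def v(t):
    return g * t + v0

a, b = 1.0, 4.0
mean_v = mean_value(v, a, b)
endpoint_avg = (v(a) + v(b)) / 2   # arithmetic mean of the endpoint values
midpoint_val = v((a + b) / 2)      # value at the midpoint of the interval
distance = (b - a) * mean_v        # distance traveled over [a, b]

# Nonlinear case: for y = x**3 on [0, 1] the three numbers no longer agree.
def cube(x):
    return x ** 3

mean_cube = mean_value(cube, 0.0, 1.0)
cube_endpoint_avg = (cube(0.0) + cube(1.0)) / 2
cube_midpoint_val = cube(0.5)

print("linear:", mean_v, endpoint_avg, midpoint_val)
print("cubic: ", mean_cube, cube_endpoint_avg, cube_midpoint_val)
```

For the linear velocity all three printed numbers coincide up to rounding, while for the cubic the true mean lies strictly between the midpoint value and the endpoint average.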

Let us consider the example of the quadratic function (parabola) $y = rx^2 + px + q$. Suppose, for the sake of definiteness, that $r$ is positive, and let us consider an arc of the parabola, say with $a \le x \le b$. Figure 7.8.3 then shows that [7.22]

$$y\left(\frac{a+b}{2}\right) < \frac{y(a) + y(b)}{2}.$$

We now turn to the integral $\int_a^b y(x)\,dx$, that is, the area under the parabola. This area is smaller than that of the trapezoid with bases $A_0A$ and $B_0B$. On the other hand, if we

7.22 See Section 7.4. We recall that the parabola $y = rx^2$, $r > 0$, is convex downward, and the parabola $y = rx^2 + px + q$ with arbitrary $p$ and $q$ is derived from the parabola $y = rx^2$ via parallel translation (see Section 1.4).
draw through the point $C\left(\frac{a+b}{2},\, y\left(\frac{a+b}{2}\right)\right)$ the tangent to the parabola, it intersects the vertical lines $x = a$ and $x = b$ at points $A'$ and $B'$ and completes the trapezoid $A_0B_0B'A'$ with midline $C_0C = y\left(\frac{a+b}{2}\right)$; the area of this trapezoid is obviously smaller than the area under the curve. Thus, in the case of a parabola with $r > 0$ we have

$$(b - a)\,y\left(\frac{a+b}{2}\right) < \int_a^b y(x)\,dx < (b - a)\,\frac{y(a) + y(b)}{2}.$$

Correspondingly, we have the following inequalities for the mean value $\bar y$ over the interval from $a$ to $b$:

$$y\left(\frac{a+b}{2}\right) < \bar y < \frac{y(a) + y(b)}{2}.$$

For the quadratic function there is also the exact formula, known as Simpson's rule [7.23] (we give it without derivation; see Exercise 7.8.2), which is valid for both signs of $r$:

$$\bar y = \frac23\,y\left(\frac{a+b}{2}\right) + \frac13\left[\frac{y(a) + y(b)}{2}\right] = \frac16\,y(a) + \frac23\,y\left(\frac{a+b}{2}\right) + \frac16\,y(b). \qquad (7.8.3)$$

This formula provides a good approximation scheme for calculating the area under any smooth curve (see Exercise 7.8.4b).
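A short numerical illustration of (7.8.3) may help. It is a sketch only: the function name `simpson_mean`, the sample cubic, and the interval are assumptions made for the example. The exact mean of the cubic is computed from its antiderivative, and the sine is included to show that for a general smooth curve the rule is only approximate.

```python
import math

def simpson_mean(y, a, b):
    """Mean-value estimate (1/6) y(a) + (2/3) y((a+b)/2) + (1/6) y(b), as in (7.8.3)."""
    return y(a) / 6 + 2 * y((a + b) / 2) / 3 + y(b) / 6

# Illustrative cubic y = A x^3 + B x^2 + C x + D and its antiderivative.
A, B, C, D = 2.0, -1.0, 3.0, 5.0

def cubic(x):
    return A * x**3 + B * x**2 + C * x + D

def cubic_antiderivative(x):
    return A * x**4 / 4 + B * x**3 / 3 + C * x**2 / 2 + D * x

a, b = -1.0, 2.0
exact_mean = (cubic_antiderivative(b) - cubic_antiderivative(a)) / (b - a)
simpson_cubic = simpson_mean(cubic, a, b)

# For a function that is not a polynomial of degree <= 3 the rule is approximate.
simpson_sine = simpson_mean(math.sin, 0.0, math.pi)
true_sine_mean = 2 / math.pi

print(exact_mean, simpson_cubic)
print(true_sine_mean, simpson_sine)
```

The first pair of numbers agrees to rounding error, in line with footnote 7.23; the second pair differs by a few percent.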

Here are two more simple facts per- 
taining to mean values. 

1. The mean value of a constant over 
any interval is that constant. This is 
clear physically: if the instantaneous 
velocity does not change, then the 
mean, or average, velocity over an 
interval is equal to the constant value 
of the instantaneous velocity. 

7.23 Thomas Simpson (1710-1761), an English algebraist, analyst, geometer, and probabilist. Simpson's rule (7.8.3) is valid even for a cubic function $y = Ax^3 + Bx^2 + Cx + D$ (for arbitrary $A$, $B$, $C$, and $D$; if $A = 0$, we have the case of a quadratic function; see Exercise 7.8.2c).


This is also very simply obtained from formula (7.8.1a):

$$\bar C(a, b) = \frac{\int_a^b C\,dx}{b - a} = \frac{C\,(b - a)}{b - a} = C.$$


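2. The mean value of the sum of two functions is equal to the sum of the mean values of the summands:

$$\overline{y_1 + y_2} = \bar y_1 + \bar y_2.$$

Indeed,

$$\overline{y_1 + y_2} = \frac{1}{b - a}\int_a^b \left[y_1(x) + y_2(x)\right]dx = \frac{1}{b - a}\int_a^b y_1(x)\,dx + \frac{1}{b - a}\int_a^b y_2(x)\,dx = \bar y_1 + \bar y_2.$$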

Here are some additional examples. Let us find the mean value of the function $y = \sin x$ on the interval from $x = 0$ to $x = \pi$:

$$\bar y(0, \pi) = \frac{\int_0^\pi \sin x\,dx}{\pi - 0} = \frac{-\cos x\,\big|_0^\pi}{\pi} = \frac{2}{\pi} \approx 0.637.$$

The mean value of the function $y = \sin x$ on the interval between $x = 0$ and $x = b$ is

$$\bar y(0, b) = \frac{\int_0^b \sin x\,dx}{b - 0} = \frac{1 - \cos b}{b}. \qquad (7.8.4)$$

What will happen if we increase the number $b$ without bound, that is, if we increase the interval without bound? In (7.8.4) the numerator does not exceed two for arbitrary $b$ (it is equal to 2 if $\cos b = -1$, that is, for $b = \pi, 3\pi, 5\pi, 7\pi$, etc.). The denominator in (7.8.4) will increase without bound, and so we can say that over an infinitely long interval (we arrive at such an interval if we send $b$ to $\infty$) the mean value of the sine is zero (namely, the larger the interval, the closer to zero is the mean value of $\sin x$). By applying a similar method, we can show that the mean value of the function $y = \cos x$ over an infinite interval is also equal to zero. Indeed,

$$\bar y(0, b) = \frac{\int_0^b \cos x\,dx}{b - 0} = \frac{\sin x\,\big|_0^b}{b} = \frac{\sin b}{b}, \qquad (7.8.5)$$

and if we increase the number $b$ without bound, the denominator of (7.8.5) will increase without bound while the absolute value of the numerator will not exceed unity. Hence, the entire fraction tends to zero: $\bar y(0, \infty) = 0$. In literally the same way we find that the mean value of the function $y = \cos kx$ (with $k \ne 0$) is also equal to zero over an infinitely long interval.

Let us find the mean value of the function $y = \sin^2 x$ over the infinite interval from $x = 0$ to $x = \infty$. Using a familiar formula of trigonometry, $\sin^2 x = \frac12(1 - \cos 2x)$, we find that

$$\overline{\sin^2 x} = \frac12 - \frac12\,\overline{\cos 2x} = \frac12 - 0 = \frac12.$$

If we now take advantage of the formula $\sin^2 x + \cos^2 x = 1$, we get the mean value of $\cos^2 x$ over the same interval:

$$\overline{\cos^2 x} = 1 - \overline{\sin^2 x} = 1 - \frac12 = \frac12.$$
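The limiting behaviour of these means can be watched numerically. The sketch below assumes a simple midpoint-rule helper `mean_value` (a name chosen for the example) and a few sample interval lengths; it is an illustration, not a proof.

```python
import math

def mean_value(f, a, b, n=100_000):
    """Midpoint-rule estimate of (1/(b-a)) * integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)

def sin_squared(x):
    return math.sin(x) ** 2

# Means over [0, b] for growing b: sin and cos tend to 0, sin^2 tends to 1/2.
for b in (math.pi, 10.0, 100.0, 1000.0):
    print(b, mean_value(math.sin, 0.0, b),
          mean_value(math.cos, 0.0, b),
          mean_value(sin_squared, 0.0, b))

mean_sin_pi = mean_value(math.sin, 0.0, math.pi)      # compare with 2/pi
mean_sin_1000 = mean_value(math.sin, 0.0, 1000.0)     # compare with (1 - cos b)/b
mean_cos_1000 = mean_value(math.cos, 0.0, 1000.0)     # compare with sin(b)/b
mean_sin2_1000 = mean_value(sin_squared, 0.0, 1000.0)
```

The numerical values follow the closed forms (7.8.4) and (7.8.5), so the means of the sine and cosine shrink like $1/b$ while the mean of $\sin^2 x$ settles near $1/2$.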

Often it is convenient to use means instead of integrals. In essence, these quantities are equivalent: knowing the integral $I = \int_a^b y\,dx$, we can easily find the mean $\bar y = I/(b - a)$, while having calculated the mean $\bar y$ we can easily find $I = (b - a)\,\bar y$.

The convenience of the mean lies in the fact that $\bar y$ is a quantity with the same dimensions as those of $y$ and, obviously, is of the same order of magnitude as $y$ is on the interval under investigation (compare with what was said in connection with formula (5.6.6)). Therefore, it is more difficult to miss an error of a factor of ten in the value of $\bar y$ than to miss the same error in the value of the integral.


It is usually assumed that the student studying higher mathematics has mastered arithmetic and algebra and would never allow an error of a factor of ten in the quantity determined or an error in the sign. Experience has shown, however, that this is not the case. For this reason, calculations must always be carried out in such a manner as to reduce the probability of an undetected error.

We note also that the definition (7.8.1) of the mean $\bar v$ of the function $v(t)$ is closer, perhaps, to the concept of the weighted mean or weighted average than to the concept of the mean $\bar v = m^{-1}(v_1 + v_2 + \cdots + v_m)$ of a set of numbers understood as the arithmetic mean. Suppose that we have to determine the average mass $\bar p$ of one stone in a pile of stones (or the average price $\bar z$ of a book in a bookstore). Let us first find the total mass of the stones (or the total price of the books) by setting up the sum $P = p_1 n_1 + p_2 n_2 + \cdots + p_k n_k$ (respectively, $Z = z_1 n_1 + z_2 n_2 + \cdots + z_k n_k$), where $n_i$ is the number of "like" stones (or books), that is, stones with the same mass $p_i$ (or books with the same price $z_i$). Then we divide $P$ (or $Z$) by the total number $N$ of stones in the heap (or books in the store). The result is

$$\bar p = \frac{p_1 n_1 + p_2 n_2 + \cdots + p_k n_k}{N} \quad \text{or} \quad \bar z = \frac{z_1 n_1 + z_2 n_2 + \cdots + z_k n_k}{N}.$$

We do the same every time we wish to find the average of the quantities $y_1, y_2, \ldots, y_N$ when these quantities are separated into $k$ groups of "like" objects, with $n_1, n_2, \ldots, n_k$ objects in each group (that is, $n_1$ objects in the first group, $n_2$ in the second, and so on; obviously, $n_1 + n_2 + \cdots + n_k = N$).
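As a small illustration of the weighted mean, here is a sketch of the bookstore computation. The prices and counts are invented for the example, and `weighted_mean` is a name chosen here, not one used in the text.

```python
def weighted_mean(values, counts):
    """Return (z_1 n_1 + ... + z_k n_k) / (n_1 + ... + n_k)."""
    total = sum(z * n for z, n in zip(values, counts))
    return total / sum(counts)

# Hypothetical bookstore: k = 3 distinct prices and the number of books at each price.
prices = [2.0, 5.0, 10.0]
counts = [50, 30, 20]

average_price = weighted_mean(prices, counts)       # weights are the counts n_i
plain_mean_of_prices = sum(prices) / len(prices)    # ignores how many books have each price

print(average_price, plain_mean_of_prices)
```

The two numbers differ because the cheap books are more numerous; only the weighted mean answers the question "what does a book cost on average?".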

Let us now return to our example of a continuously changing velocity $v$. Suppose that we know the law $v = v(x)$ by which the (instantaneous) velocity $v$ can be found at each point on the path of travel. We wish to find the total time $T$ it took an object to travel with velocity $v(x)$ from point $x = a$ to point $x = b$. The time $\Delta t$ it takes the object to travel a small distance $\Delta x$ will obviously be equal to the quotient $\Delta x / v$, where $v$ is the velocity over $\Delta x$, which can be assumed constant since it has no time to change appreciably over $\Delta x$. (The more exact way of stating this is to write $\Delta t \approx \Delta x / v$, since even over a small distance $\Delta x$ the velocity changes somewhat.) The total time of travel will be approximately expressed by the sum

$$T \approx \frac{\Delta x_1}{v_1} + \frac{\Delta x_2}{v_2} + \cdots + \frac{\Delta x_k}{v_k} = \sum_{i=1}^{k} \frac{\Delta x_i}{v_i}, \qquad (7.8.6)$$

where $\Delta x_i = x_i - x_{i-1}$ (here $i = 1, 2, \ldots, k$, and $x_0 = a, x_1, x_2, \ldots, x_k = b$ are points that divide the entire distance from $a$ to $b$ into $k$ small intervals), and where $v_i$ can be taken as being either $v(x_i)$ or $v(x_{i-1})$ (cf. Section 3.1). The exact value of $T$ is provided by the following integral:

$$T = \int_a^b \frac{dx}{v(x)}, \qquad (7.8.6a)$$

which is the limit of the weighted sum (7.8.6) in which the various $\Delta x_i$ are taken with different weights $1/v_i$; the mean value of $1/v$ is given by the ratio of (7.8.6a) to the entire distance traveled, $b - a$.
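The sum (7.8.6) is easy to evaluate directly. In the sketch below the velocity law $v(x) = 1 + x$, the interval, and the function name `travel_time` are assumptions chosen so that the integral (7.8.6a) has the simple exact value $\ln 2$.

```python
import math

def travel_time(v, a, b, k=100_000):
    """Sum of dx / v over k equal sub-intervals, with v taken at each midpoint."""
    dx = (b - a) / k
    return sum(dx / v(a + (i + 0.5) * dx) for i in range(k))

def v(x):
    # Illustrative velocity law v(x) = 1 + x (an assumption for this example).
    return 1.0 + x

a, b = 0.0, 1.0
T = travel_time(v, a, b)               # integral of dx/(1+x) from 0 to 1, i.e. ln 2
mean_inverse_v = T / (b - a)           # mean value of 1/v over the path
naive_T = (b - a) / ((v(a) + v(b)) / 2)  # distance divided by the x-average of v

print(T, math.log(2), naive_T)
```

The last number shows that dividing the distance by the average of $v$ over the path underestimates the travel time: the slow stretches carry more weight because the object spends more time on them.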

There are several aspects that are related to the fact that in the formulas of integral calculus we often employ weighted means and that must not be ignored. Suppose we have a chemical reaction whose rate $F = F(\theta)$, that is, the quantity of substance entering the reaction per unit time, depends on the temperature $\theta$ at which the reaction occurs; the temperature $\theta$ changes in the course of the reaction, or $\theta = \theta(t)$, where $t$ is time. The average (or mean) temperature of the reaction lasting from time $t = t_1$ to time $t = t_2$ will, obviously, be

$$\bar\theta = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} \theta(t)\,dt, \qquad (7.8.7)$$

[Figure 7.8.4]

while the average reaction rate is expressed by the formula

$$\bar F = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} F(\theta(t))\,dt. \qquad (7.8.7a)$$

One must not think that the average reaction rate $\bar F$ coincides with the rate $F(\bar\theta)$ taken at the average reaction temperature $\bar\theta$. For instance, suppose that $\theta = \theta(t)$ is a linear function (Figure 7.8.4a). Then, obviously, $\bar\theta = (\theta_1 + \theta_2)/2 = \theta\left(\frac{t_1 + t_2}{2}\right)$. As for the reaction rate $F$, we assume that it grows exponentially with temperature $\theta$, that is, $F(\theta) = e^{k\theta}$. But because $F$ grows so rapidly (see Figure 7.8.4b; note that since the temperature $\theta$ increases uniformly with time, $F$ as a function of $t$ is also an exponential function), the value of the integral (7.8.7a) is almost entirely determined by the (large) values of $F$ at the right end of the interval of variation of $\theta$ (or $t$) and depends but little on the other values of $F$, whereby we cannot write $\bar F = F(\bar\theta) = \frac12\left[F(\theta_1) + F(\theta_2)\right]$. This fact is completely explained by the nonhomogeneity in the weights of $F(\theta)$ associated with different intervals $\Delta\theta$ of variation of $\theta$.
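The gap between $\bar F$ and $F(\bar\theta)$ can be made concrete with numbers. The sketch below uses invented values ($k = 1$, temperature rising linearly from 0 to 10 over a unit time interval) and a midpoint-rule helper `mean_value`; all of these are assumptions for illustration only.

```python
import math

def mean_value(f, a, b, n=10_000):
    """Midpoint-rule estimate of (1/(b-a)) * integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)

# Illustrative data: linear temperature rise and exponential rate law.
k = 1.0
theta1, theta2 = 0.0, 10.0
t1, t2 = 0.0, 1.0

def theta(t):
    return theta1 + (theta2 - theta1) * (t - t1) / (t2 - t1)

def F(th):
    return math.exp(k * th)

mean_theta = mean_value(theta, t1, t2)                 # as in (7.8.7)
mean_F = mean_value(lambda t: F(theta(t)), t1, t2)     # as in (7.8.7a)
F_at_mean_theta = F(mean_theta)
endpoint_avg_F = (F(theta1) + F(theta2)) / 2

print(mean_theta, F_at_mean_theta, mean_F, endpoint_avg_F)
```

With these numbers the rate at the mean temperature, the true mean rate, and the endpoint average differ from one another by large factors, which is exactly the point made in the text.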


In connection with the question of mean values we recall two theorems that refer to continuous and smooth functions.

1. Lagrange's theorem. The mean value of a function over an interval is equal to the value of the function at some point within the interval. In the notation of formula (7.8.1a),

$$\bar f(a, b) = f(c), \quad a < c < b. \qquad (7.8.8)$$

Let us draw a rectangle whose area is equal to the integral $\int_a^b f(x)\,dx$, which is the area of a curvilinear trapezoid bounded from above by the curve $y = f(x)$, from below by the $x$ axis, and from the sides by the straight lines $x = a$ and $x = b$. The base of the rectangle is the line segment from $a$ to $b$, while the height of the rectangle, or the effective altitude of the trapezoid, is equal to the mean value $\bar f(a, b)$ of the function. Obviously, the upper base of the rectangle is certain to intersect the arc of the curve $y = f(x)$ within the limits $a$ and $b$ at a certain point $(c, f(c))$ that lies somewhere between the endpoints of the arc (Figure 7.8.5a). Here $f(x)$ was considered continuous, since if $f(x)$ is discontinuous at some point within the interval, the above statement may prove to be invalid (Figure 7.8.5b).

Without looking at the figure, we can reason as follows. If $M$ and $m$ are the greatest and smallest values of $f(x)$ on the segment $a \le x \le b$, that is, if $M \ge f(x) \ge m$, then

$$M(b - a) = \int_a^b M\,dx \ge \int_a^b f(x)\,dx \ge \int_a^b m\,dx = m(b - a)$$

and, hence,

$$M \ge \frac{1}{b - a}\int_a^b f(x)\,dx = \bar f \ge m.$$

But this leads us to (7.8.8), since a continuous function assumes on the interval from $a$ to $b$ (at a point $c$ in this interval) every value $A$ that is intermediate between the greatest and smallest values (i.e. such that $M \ge A \ge m$), in particular the value

$$\bar f = \frac{1}{b - a}\int_a^b f(x)\,dx.$$
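The point $c$ of (7.8.8) can be located numerically for a concrete function. The following sketch assumes $f(x) = e^x$ on the interval from 0 to 1 and uses bisection, which works here because $f$ is continuous and monotone; the helper names `mean_value` and `find_c` are illustrative.

```python
import math

def mean_value(f, a, b, n=10_000):
    """Midpoint-rule estimate of (1/(b-a)) * integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)

def find_c(f, a, b, target, tol=1e-12):
    """Bisection for f(c) = target; assumes f - target changes sign on [a, b]."""
    lo, hi = a, b
    g_lo = f(lo) - target
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f(mid) - target
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid   # sign change lies in the right half
        else:
            hi = mid                # sign change lies in the left half
    return (lo + hi) / 2

a, b = 0.0, 1.0
f_mean = mean_value(math.exp, a, b)    # close to e - 1
c = find_c(math.exp, a, b, f_mean)     # close to ln(e - 1)

print(f_mean, c, math.exp(c))
```

The value $c$ found lies strictly inside the interval, and $f(c)$ reproduces the mean value, as the theorem asserts.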


In a similar manner we can establish a simple generalization of the above theorem: if a function $f(x)$ on the line segment $a \le x \le b$ is continuous and another function $g(x)$ does not change its sign on this interval (i.e. is either everywhere positive or everywhere negative), then

$$\int_a^b f(x)\,g(x)\,dx = f(c)\int_a^b g(x)\,dx, \qquad (7.8.9)$$

where, as usual, $a < c < b$.

Indeed, suppose that, for the sake of definiteness, $g(x)$ is nonnegative for $a \le x \le b$. Then, obviously,

$$M\int_a^b g(x)\,dx = \int_a^b M g(x)\,dx \ge \int_a^b f(x)\,g(x)\,dx \ge \int_a^b m\,g(x)\,dx = m\int_a^b g(x)\,dx, \qquad (7.8.10)$$

where again $M$ and $m$ are the greatest and smallest values of the function $f(x)$ on the interval $a \le x \le b$. Dividing both parts of (7.8.10) by $\int_a^b g(x)\,dx > 0$, we get [7.24]


7.24 Clearly, if a function $g(x)$ that does not change its sign on an interval $a < x < b$ is not "pathological" (say, not like the function $f(x)$ described at the beginning of Section 14.1), then the equality $\int_a^b g(x)\,dx = 0$ for $b > a$ means that $g(x) \equiv 0$, and in this case the validity of (7.8.9) is obvious.
$$M \ge \frac{\int_a^b f(x)\,g(x)\,dx}{\int_a^b g(x)\,dx} \ge m. \qquad (7.8.11)$$

But a continuous function $f(x)$ assumes on the line segment $a \le x \le b$ all the values that lie between the greatest and smallest values of the function on this interval; hence, each number that lies between $M$ and $m$, which means the fraction in (7.8.11), too, is equal to the value $f(c)$ at an intermediate point $c$ (lying between $a$ and $b$):

$$\frac{\int_a^b f(x)\,g(x)\,dx}{\int_a^b g(x)\,dx} = f(c). \qquad (7.8.12)$$

[Figure 7.8.6]

(It is clear that (7.8.12) is equivalent to (7.8.9).)

Example. If $f(x)$ is an arbitrary continuous function and $a$ and $b$ have the same sign (i.e. both are either nonnegative or nonpositive), then

$$\int_a^b x^n f(x)\,dx = f(c)\int_a^b x^n\,dx = \frac{b^{n+1} - a^{n+1}}{n + 1}\,f(c), \qquad (7.8.13)$$

where the number $c$ lies between $a$ and $b$.

2. Let us take a function $y = y(x)$ such that $y(a) = y(b)$. It is assumed that the function is smooth. The above hypothesis implies that the function, unless it is constant (in which case $y' = 0$ everywhere), cannot be monotone in the interval from $a$ to $b$. But if the function is not monotone, it must have a maximum or minimum somewhere between $a$ and $b$. This means that within the interval at a certain point $x = c$, where $a < c < b$, the derivative of the function vanishes: $y'(c) = 0$. This statement is known as Rolle's theorem. [7.25]

The geometric meaning of Rolle's theorem is clear from Figure 7.8.6, where we have depicted two types of curves with maxima and minima (the second curve has both a maximum and a minimum between $a$ and $b$).

Lagrange's and Rolle's theorems are interconnected: Rolle's theorem can be obtained as a corollary of Lagrange's theorem. Indeed, let us write the following identity that follows from the relationship between a derivative and an integral (see Section 3.3):

$$y(b) = y(a) + \int_a^b y'(x)\,dx.$$

From this it follows that

$$\overline{y'}(a, b) = \frac{\int_a^b y'(x)\,dx}{b - a} = \frac{y(b) - y(a)}{b - a}.$$

But by the hypothesis of Rolle's theorem, $y(b) = y(a)$, whereby here $\overline{y'} = 0$. Now we only need to apply Lagrange's theorem to $y'(x)$: if the mean value of a function (in our case the derivative of $y(x)$) is zero on the interval from $a$ to $b$, then there is a point $c$ within the interval at which the function vanishes, $y'(c) = 0$.

7.25 Michel Rolle (1652-1719), a French analyst, algebraist, and geometer.
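The link between the two theorems can be traced on an example. In the sketch below the function $y = x(1 - x)e^x$, which vanishes at both ends of the interval from 0 to 1, is an assumption chosen for illustration, as is the helper name `bisect_zero`.

```python
import math

def y(x):
    # Illustrative smooth function with y(0) = y(1) = 0.
    return x * (1 - x) * math.exp(x)

def y_prime(x):
    # Derivative of y, computed by hand: (1 - x - x^2) e^x.
    return (1 - x - x * x) * math.exp(x)

def bisect_zero(f, lo, hi, tol=1e-12):
    """Bisection for a zero of f; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

a, b = 0.0, 1.0
mean_of_derivative = (y(b) - y(a)) / (b - a)   # mean value of y' over [a, b]
c = bisect_zero(y_prime, a, b)                 # point where y' takes its mean value, 0

print(mean_of_derivative, c, y_prime(c))
```

Since the mean of $y'$ over the interval is zero, Lagrange's theorem guarantees an interior point where $y'$ vanishes; here it is the positive root of $1 - x - x^2 = 0$.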


Exercises 

7.8.1. Find the mean value of the function $y = x^2$ on the interval from 0 to 2. Compare this mean value with the arithmetic mean of the values of the function at the endpoints of the interval and with the value in the middle of the interval.

7.8.2. Verify Simpson's rule (7.8.3) for (a) the function of Exercise 7.8.1, (b) the general quadratic function $y = rx^2 + px + q$, and (c) the cubic function $y = Ax^3 + Bx^2 + Cx + D$.

7.8.3. The force of gravity diminishes as the distance from the earth's center grows according to the law $F = A/r^2$. Find the mean value of the force of gravity over a path that starts at the earth's surface ($R$ is the radius of the earth) and ends at a point that is $R$ distant from the earth's surface (i.e. at a point that is $2R$ distant from the earth's center). [Hint. To find the mean value, use the "energy integral" $\int F(x)\,dx$, where $F$ is the force and $x$ the distance. This integral has an important physical meaning.]

7.8.4. Compare the mean value found in Exercise 7.8.3 with (a) the arithmetic mean of the values of the force of gravity at the endpoints of the interval (which coincides with the average when the quantity changes linearly with distance), and (b) the mean value calculated via Simpson's rule (7.8.3) (which coincides with the true mean value for a quadratic law of variation of the quantity with distance).

7.8.5. Find the average value of $y = x^n$ over the interval from $x = 0$ to $x = x_0$.

7.8.6. Find the mean value of the function $y = ce^{kx}$ over the interval in which $y$ varies from $y = n$ to $y = m$; express this mean value in terms of $n$ and $m$, eliminating $c$ and $k$ from the answer. Investigate the resulting expression when $m$ is close to $n$, or $m = n + v$, $v \ll n$.

7.8.7. Find the mean values of the functions $y = \sin^2 x$ and $y = \cos^2 x$ over the following intervals: (a) from $x = 0$ to $x = \pi$, and (b) from $x = 0$ to $x = \pi/4$.

7.8.8. Determine the period of the function $y = \sin(\omega t + \alpha)$, where $\omega$ and $\alpha$ are constants. Find the mean value of $y^2$ over one period of $y^2$.


7.9 Arc Length 



Let us pose the problem of finding the arc length $s$ of a curve $y = f(x)$ from the point where $x = a$ to the point where $x = b$ (Figure 7.9.1). We replace the length $\Delta s$ of a small arc $MN$ of the curve $y = f(x)$ by the straight-line segment $MN$ connecting $M$ and $N$. [7.26] (We consider only curves that have no discontinuities or cusps.) By the Pythagorean theorem,

$$\Delta s \approx \sqrt{(\Delta x)^2 + (\Delta y)^2} = \Delta x\,\sqrt{1 + \left(\frac{\Delta y}{\Delta x}\right)^2},$$

[Figure 7.9.1]

whence

$$\frac{\Delta s}{\Delta x} \approx \sqrt{1 + \left(\frac{\Delta y}{\Delta x}\right)^2}. \qquad (7.9.1)$$

We pass to the limit in (7.9.1) as $\Delta x \to 0$. The ratio $\Delta y/\Delta x$ becomes the derivative $y' = f'(x)$. Thus, we get

$$ds = \sqrt{1 + (f'(x))^2}\,dx.$$

The entire length of the arc is

$$s = s(a, b) = \int_a^b \sqrt{1 + (f'(x))^2}\,dx. \qquad (7.9.2)$$

7.26 The difference between the arc length and the length of the line segment (the chord) is of the order of $|\Delta x|^3$, and so can surely be neglected when we pass to the limit (to differentials). For instance, the length of the arc of a circle of radius $r$ subtending a small angle $\Delta t$ is equal to $r\,\Delta t$ (we use radians to measure angles), while the length of the chord corresponding to this arc is $2r\sin(\Delta t/2) = 2r\left[\Delta t/2 - (1/6)(\Delta t/2)^3 + \ldots\right] = r\,\Delta t - (1/24)\,r\,(\Delta t)^3 + \ldots$ (see Section 6.2), so that the difference between the two lengths is approximately $(1/24)\,r\,(\Delta t)^3$, that is, is of the order of $(\Delta s)^3$, where $\Delta s$ is a quantity of the order of the length of the arc (or chord).


If the curve is specified parametrically, that is, $x = x(t)$ and $y = y(t)$ (see Section 1.8), then, dividing the (approximate) equation

$$\Delta s \approx \sqrt{\Delta x^2 + \Delta y^2}$$

by $\Delta t$, passing to the limit as $\Delta t \to 0$, and taking into account the fact that

$$\lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \frac{dx}{dt} \quad \text{and} \quad \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} = \frac{dy}{dt},$$

we obtain

$$\frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2},$$

or $ds = \sqrt{(x')^2 + (y')^2}\,dt$. Integrating this from $t = \alpha$ to $t = \beta$, we get

$$s(\alpha, \beta) = \int_\alpha^\beta \sqrt{(x')^2 + (y')^2}\,dt, \qquad (7.9.2a)$$

where $s(\alpha, \beta)$ is the length of the arc of the curve limited by the values $\alpha$ and $\beta$ of parameter $t$.

Because of the radical under the integral sign in (7.9.2) and in (7.9.2a), it is rarely possible to evaluate the integral directly, that is, to express $s(a, b)$ and $s(\alpha, \beta)$ explicitly in terms of functions of the limits of integration $a$ and $b$ (or $\alpha$ and $\beta$).
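When the integrals (7.9.2) and (7.9.2a) cannot be evaluated in closed form, they can still be computed numerically. The sketch below uses a midpoint rule; the function names and the two test curves (the graph $y = \frac23 x^{3/2}$, whose length from 0 to 3 is $14/3$, and a quarter circle of radius $R$) are assumptions chosen because their exact lengths are known.

```python
import math

def arc_length(f_prime, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of sqrt(1 + f'(x)^2) over [a, b], as in (7.9.2)."""
    h = (b - a) / n
    return sum(math.sqrt(1 + f_prime(a + (i + 0.5) * h) ** 2) for i in range(n)) * h

def arc_length_parametric(x_prime, y_prime, alpha, beta, n=100_000):
    """Midpoint-rule estimate of the integral of sqrt(x'^2 + y'^2) dt, as in (7.9.2a)."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        t = alpha + (i + 0.5) * h
        total += math.hypot(x_prime(t), y_prime(t))
    return total * h

# Graph y = (2/3) x^(3/2): y' = sqrt(x), exact length on [0, 3] is 14/3.
s_graph = arc_length(math.sqrt, 0.0, 3.0)

# Quarter circle x = R cos t, y = R sin t, 0 <= t <= pi/2: exact length pi R / 2.
R = 2.0
s_quarter = arc_length_parametric(lambda t: -R * math.sin(t),
                                  lambda t: R * math.cos(t),
                                  0.0, math.pi / 2)

print(s_graph, 14 / 3)
print(s_quarter, math.pi * R / 2)
```

Both computed lengths agree with the exact values to many decimal places, which gives some confidence in using the same routine on curves whose arc length has no elementary expression.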

Note also that from the same right triangle [7.27] with sides $\Delta x$, $\Delta y$, and $MN$ ($\approx \Delta s$) which we used to express $\Delta s$ in terms of $\Delta x$ and $\Delta y$ we conclude that

$$\Delta x \approx \Delta s\,\cos\psi, \quad \Delta y \approx \Delta s\,\sin\psi,$$

where $\psi$ is the angle between the $x$ axis and the line segment $MN$. As $\Delta x \to 0$, obviously $\psi \to \varphi$, where $\varphi$ is the angle between the $x$ axis and the tangent to the curve $y = f(x)$ at point $M$. Thus,

$$dx = ds\,\cos\varphi, \quad dy = ds\,\sin\varphi, \quad ds = \sqrt{dx^2 + dy^2}, \qquad (7.9.2b)$$

where the fact that we have used differentials means that $dx$ ($= \Delta x$), $dy$, and $ds$ are small.

Some examples follow in which the 
computations can be carried out with 
relative ease. 

1. The circumference of a circle. We seek the circumference of the circle $x^2 + y^2 = R^2$ or, to be precise, the length $s$ of one-fourth of the circumference in the first quadrant, and then multiply it by 4.

From the equation of the circle we have

$$y = \sqrt{R^2 - x^2}, \quad y' = -\frac{x}{\sqrt{R^2 - x^2}}.$$

By formula (7.9.2),

$$s = \int_0^R \sqrt{1 + \frac{x^2}{R^2 - x^2}}\,dx = \int_0^R \frac{R\,dx}{\sqrt{R^2 - x^2}}. \qquad (7.9.3)$$

We introduce a new variable $t$ thus: $x = R\sin t$. Then $dx = R\cos t\,dt$ and from (7.9.3) we get (see Section 5.6)

$$s = \int_0^{\pi/2} R\,dt = \frac{\pi R}{2},$$

whence we obtain the well-known formula for the circumference: $C = (\pi R/2) \times 4 = 2\pi R$.

[Figure 7.9.2]

2. Catenary curve. This is a curve whose equation is

$$y = \frac{a}{2}\left(e^{x/a} + e^{-x/a}\right), \qquad (7.9.4)$$

where $a$ is a constant. [7.28] The word "catenary" comes from the Latin catena, meaning "chain"; the curve has the form of a freely hanging heavy, flexible, and inextensible cable (or chain) suspended from two fixed points (the density of the cable is assumed the same at each section of the cable). The graph of the catenary curve is given in Figure 7.9.2 (for $a = 2$).

Let us find the length of arc of the catenary from $x = 0$ to $x = x_0$. From (7.9.4) we have $y' = \dfrac{e^{x/a} - e^{-x/a}}{2}$, and so

$$\sqrt{1 + (y')^2} = \sqrt{1 + \frac{\left(e^{x/a} - e^{-x/a}\right)^2}{4}} = \sqrt{\frac{\left(e^{x/a} + e^{-x/a}\right)^2}{4}} = \frac{e^{x/a} + e^{-x/a}}{2}$$

and, hence,

$$s = \int_0^{x_0} \frac{e^{x/a} + e^{-x/a}}{2}\,dx = \left.\frac{a}{2}\left(e^{x/a} - e^{-x/a}\right)\right|_0^{x_0} = \frac{a}{2}\left(e^{x_0/a} - e^{-x_0/a}\right).$$

3. Cycloid. We know that the (parametric) equations of the cycloid are of the form

$$x = a(t - \sin t), \quad y = a(1 - \cos t),$$



7.27 This triangle played an important role in the reasoning of Leibniz, who used it in introducing differentials.

7.28 Equation (7.9.4) can also be written in the form $y = a\cosh(x/a)$, where by $\cosh t$ we have denoted the hyperbolic cosine of $t$ (see Section 14.4, in particular Figure 14.4.1).
where $a$ is the radius of the circle generating the cycloid (see Section 1.8). Whence, by (7.9.2a),

$$ds = \sqrt{a^2(1 - \cos t)^2 + (a\sin t)^2}\,dt = a\sqrt{1 - 2\cos t + \cos^2 t + \sin^2 t}\,dt = a\sqrt{2 - 2\cos t}\,dt = a\sqrt{4\sin^2\frac{t}{2}}\,dt = 2a\sin\frac{t}{2}\,dt$$

(for $0 \le t \le 2\pi$, where $\sin(t/2) \ge 0$), and the length of the arc of the cycloid lying between the points corresponding to the values $\alpha$ and $\beta$ of parameter $t$ is

$$s = 2a\int_\alpha^\beta \sin\frac{t}{2}\,dt = \left.4a\left(-\cos\frac{t}{2}\right)\right|_\alpha^\beta = 4a\left(\cos\frac{\alpha}{2} - \cos\frac{\beta}{2}\right).$$

Since two successive "cusps" of the cycloid (see Figure 1.8.3) correspond to values 0 and $2\pi$ of parameter $t$, the length of one arch of the cycloid is equal to $8a$, which is four times the diameter of the circle that generates the cycloid.
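The closed-form results for the catenary and the cycloid can be cross-checked by direct numerical integration of (7.9.2) and (7.9.2a). The sketch below is illustrative only: the helper `integrate`, the parameter values `a_cat`, `x0`, and `a_cyc` are assumptions chosen for the example.

```python
import math

def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule estimate of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

# Catenary y = (a/2)(e^(x/a) + e^(-x/a)): y' = sinh(x/a).
a_cat, x0 = 2.0, 3.0
catenary_numeric = integrate(lambda x: math.sqrt(1 + math.sinh(x / a_cat) ** 2), 0.0, x0)
catenary_formula = (a_cat / 2) * (math.exp(x0 / a_cat) - math.exp(-x0 / a_cat))

# Cycloid x = a(t - sin t), y = a(1 - cos t): x' = a(1 - cos t), y' = a sin t.
a_cyc = 1.5
cycloid_numeric = integrate(
    lambda t: math.hypot(a_cyc * (1 - math.cos(t)), a_cyc * math.sin(t)),
    0.0, 2 * math.pi)
cycloid_formula = 8 * a_cyc   # length of one arch

print(catenary_numeric, catenary_formula)
print(cycloid_numeric, cycloid_formula)
```

In both cases the numerical integral matches the formula derived in the text to many digits.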

As pointed out earlier, in most cases it is difficult (or even impossible) to integrate the function $\sqrt{1 + (y'(x))^2}$ in terms of elementary functions due to the radical. For this reason, approximate formulas for computing arc length are of particular interest.

Suppose that $(y'(x))^2$ is small compared with unity: $|y'(x)| \ll 1$. Then, neglecting $(y'(x))^2$ in (7.9.2), we get

$$s \approx \int_a^b \sqrt{1}\,dx = b - a. \qquad (7.9.5)$$

The difference $b - a$ is the length of the horizontal line segment with endpoints $x = a$ and $x = b$. Formula (7.9.5) shows that if $y'$ is small in absolute value (i.e. the curve differs little from the horizontal), then the length of the corresponding arc of this curve is also close to the length of the horizontal line segment (Figure 7.9.3a).

But if $(y'(x))^2 \gg 1$, then in (7.9.2) we can ignore unity in comparison with $(y'(x))^2$. The result (we take $y' > 0$ for definiteness; for $y' < 0$ replace $y'$ by $|y'|$) is

$$s \approx \int_a^b \sqrt{(y'(x))^2}\,dx = \int_a^b y'(x)\,dx = y(b) - y(a). \qquad (7.9.6)$$

[Figure 7.9.3]

This formula shows us that in the given case the arc length of the curve is close to that of the vertical line segment with endpoints $y(a)$ and $y(b)$ (Figure 7.9.3b). Indeed, if the derivative $y'$ is great, then the curve bends steeply upward and so is similar to a vertical straight line (for a vertical line the derivative is infinite).

The formulas (7.9.5) and (7.9.6) yield simple approximate formulas for the arc length. But these are very rough approximations, which can be obtained without appealing to (7.9.2). It is easy to obtain more exact formulas.

Let $|y'(x)| < 1$. Retaining two terms in the binomial expansion of $\sqrt{1 + (y'(x))^2}$ (see Section 6.4) and discarding the other terms, we can write

$$\sqrt{1 + (y'(x))^2} = \left[1 + (y'(x))^2\right]^{1/2} \approx 1 + \frac12\,(y'(x))^2.$$

Formula (7.9.2) now yields

$$s \approx \int_a^b \left[1 + \frac12\,(y'(x))^2\right]dx = (b - a) + \frac12\int_a^b (y'(x))^2\,dx.$$

But if $|y'(x)| > 1$, then

$$\sqrt{1 + (y'(x))^2} = y'(x)\,\sqrt{1 + \frac{1}{(y'(x))^2}}.$$
To the radical on the right-hand side we can apply the binomial theorem since $(y'(x))^{-2} < 1$. Retaining only two terms in the expansion, we get

$$\sqrt{1 + (y'(x))^2} \approx y'(x)\left[1 + \frac{1}{2\,(y'(x))^2}\right] = y'(x) + \frac{1}{2\,y'(x)}. \qquad (7.9.7)$$

Substituting this into (7.9.2), we get

$$s \approx \int_a^b y'(x)\,dx + \frac12\int_a^b \frac{dx}{y'(x)} = y(b) - y(a) + \frac12\int_a^b \frac{dx}{y'(x)}.$$

We now have the following approximate formulas:

$$s \approx (b - a) + \frac12\int_a^b (y'(x))^2\,dx \quad \text{if } |y'(x)| < 1,$$

$$s \approx y(b) - y(a) + \frac12\int_a^b \frac{dx}{y'(x)} \quad \text{if } |y'(x)| > 1. \qquad (7.9.8)$$

The integrals here are simpler than the integral in (7.9.2) and it is easier to perform the computations by these formulas than by (7.9.2). But these formulas are approximate, whereas (7.9.2) is exact.

What errors result from their use? The first formula in (7.9.8) is best for small $|y'|$ while the second is best for large $|y'|$. Both formulas yield the worst results at $|y'| = 1$. Therefore, to estimate the error here we will examine the worst case, which is $y'(x) = 1$. It is clear that if $y'(x) = 1$, then $y(x) = x + c$; the graph of this function is a straight line.

By the exact formula (7.9.2),

$$s = \int_a^b \sqrt{1 + 1}\,dx = \sqrt{2}\,(b - a) \approx 1.414\,(b - a) \qquad (7.9.9)$$

(this result immediately follows from the Pythagorean theorem or elementary trigonometry, since the straight line $y = x + c$ forms an angle of 45° with the axis of abscissas).

By the first of the formulas (7.9.8),

$$s \approx (b - a) + \frac12\int_a^b dx = \frac32\,(b - a) = 1.5\,(b - a). \qquad (7.9.10)$$

The second formula in (7.9.8) yields the same result: $s \approx 1.5\,(b - a)$. Comparing (7.9.9) and (7.9.10), we see that the maximum error in the approximate formulas is about 6%.

When computing arc length, we must divide the curve into portions over which either $|y'| \le 1$ everywhere or $|y'| \ge 1$ everywhere. Then the error will certainly not exceed 6%. And since $(y'(x))^2$ assumes a value equal to unity only at certain points of the curve, a proper partition of the curve into portions will reduce the error below 6% (and usually substantially below 6%). Of course, there is no sense in finding the length of rectilinear portions (for one, portions over which $y'(x) \equiv 1$) by means of approximate formulas.

Let us take a look at some examples; almost always the computations are carried out to two decimal places.

1. Find the arc length of the parabola $y = x^2$ between the points with abscissas $x = 0$ and $x = 2$.

We find the derivative, $y' = 2x$. It is equal to 1 at $x = 0.5$, is greater than 1 for $x > 0.5$, and is less than unity for $x < 0.5$. Therefore, the arc length $s_1$ corresponding to a variation of $x$ from 0 to 0.5 can be found from the first formula in (7.9.8) and the arc length $s_2$ corresponding to a variation of $x$ from 0.5 to 2, by the second formula:

$$s_1 \approx (0.5 - 0) + 0.5\int_0^{0.5} 4x^2\,dx = 0.5 + 2\cdot\frac{(0.5)^3}{3} \approx 0.58,$$

$$s_2 \approx y(2) - y(0.5) + \frac12\int_{0.5}^{2} \frac{dx}{2x} = 3.75 + 0.25\,(\ln 2 - \ln 0.5) \approx 4.10,$$

whence the sought-for arc length is

$$s = s_1 + s_2 \approx 0.58 + 4.10 = 4.68.$$
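The arithmetic of this example can be reproduced in a few lines and compared with a direct numerical evaluation of (7.9.2). This is a sketch under stated assumptions: the integrals in (7.9.8) are entered in the closed forms worked out above, and the midpoint-rule helper `integrate` is a name introduced only for this check.

```python
import math

def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule estimate of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

# First formula of (7.9.8) on [0, 0.5], where |y'| = 2x <= 1:
# (b - a) + (1/2) * integral of 4x^2 dx = 0.5 + (2/3) * 0.5^3.
s1 = (0.5 - 0.0) + 0.5 * (4 / 3) * (0.5 ** 3 - 0.0 ** 3)

# Second formula of (7.9.8) on [0.5, 2], where y' = 2x >= 1:
# y(b) - y(a) + (1/2) * integral of dx/(2x) = (4 - 0.25) + (1/4) ln(2/0.5).
s2 = (2.0 ** 2 - 0.5 ** 2) + 0.25 * math.log(2.0 / 0.5)

s_approx = s1 + s2

# Independent check: numerical integration of sqrt(1 + 4x^2) over [0, 2].
s_numeric = integrate(lambda x: math.sqrt(1 + 4 * x * x), 0.0, 2.0)
relative_error = abs(s_approx - s_numeric) / s_numeric

print(s1, s2, s_approx, s_numeric, relative_error)
```

The relative error comes out below one percent, far smaller than the 6% worst-case bound, because $|y'|$ equals unity at a single point only.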

On the other hand, by (7.9.2), 

2 

s = 1 ^ l + 4z 2 dx , 

0 

whence, by making the change of variable 
2 x = z and employing formula 33 from Ap- 
pendix 2 at the end of the book, we find the exact value of s:

\[ \int \sqrt{1 + 4x^2}\,dx = \frac{1}{2}\,x\sqrt{4x^2 + 1} + \frac{1}{4}\ln\left(2x + \sqrt{4x^2 + 1}\right) + C, \tag{7.9.11} \]

and, consequently,

\[ s = \frac{1}{2}\left[2\sqrt{17} + \frac{1}{2}\ln\left(4 + \sqrt{17}\right)\right] \approx 4.65. \tag{7.9.12} \]

(Any doubting reader can convince himself of the truth of the formula (7.9.11) by taking the derivative of the right-hand side of this formula.)

The error introduced by formulas (7.9.8) came out to about 0.7%.

2. Find the arc length of the exponential curve y = e^x between the points with abscissas x = 0 and x = 1.

In this case, y' = e^x; the derivative grows from 1 to e as x increases from 0 to 1. So we use the second formula in (7.9.8):

\[ s \approx \int_0^1 \left(e^x + \frac{1}{2}e^{-x}\right) dx \approx 2.72 - 1 - 0.5\,e^{-x}\Big|_0^1 \approx 2.04. \]

The exact formula for the arc length yields the value (see Exercises 7.9.1b and 7.9.2)

\[ s = \sqrt{1 + e^2} - \sqrt{2} + \frac{1}{2}\ln\frac{\sqrt{e^2 + 1} - 1}{\sqrt{e^2 + 1} + 1} - \frac{1}{2}\ln\frac{\sqrt{2} - 1}{\sqrt{2} + 1} \approx 2.00. \]

The approximate formula introduces an error of 2%.

The arc length may also be approximated occasionally by a series expansion of the integrand in (7.9.2) in powers of x. By retaining an appropriate number of terms of the expansion, we can obtain the arc length to any degree of accuracy.

Let us consider an example. We wish to determine the circumference of a circle by seeking the arc length s of the circle corresponding to a central angle of 30°. The circumference is C = 12s. Quite obviously we will obtain the same kind of integral as in (7.9.3) but with a different upper limit:

\[ s = \int_0^{R/2} \frac{R\,dx}{\sqrt{R^2 - x^2}} \tag{7.9.13} \]

(here we have used the fact that R sin 30° = R/2). The integrand is transformed as follows:

\[ \frac{R}{\sqrt{R^2 - x^2}} = \frac{1}{\sqrt{1 - (x/R)^2}} = \left[1 - \left(\frac{x}{R}\right)^2\right]^{-1/2}. \tag{7.9.14} \]

We put (x/R)^2 = t and expand (7.9.14) in a binomial series (see formula (6.4.3)) to get

\[ \left[1 - \left(\frac{x}{R}\right)^2\right]^{-1/2} = (1 - t)^{-1/2} = 1 + \frac{1}{2}t + \frac{3}{8}t^2 + \frac{5}{16}t^3 + \frac{35}{128}t^4 + \dots = 1 + \frac{1}{2}\left(\frac{x}{R}\right)^2 + \frac{3}{8}\left(\frac{x}{R}\right)^4 + \dots \tag{7.9.15} \]

Substituting (7.9.15) into (7.9.13) and integrating, we find that

\[ s = R\left(\frac{1}{2} + \frac{1}{6\cdot 2^3} + \frac{3}{40\cdot 2^5} + \frac{5}{16\cdot 7\cdot 2^7} + \frac{35}{128\cdot 9\cdot 2^9} + \dots\right). \tag{7.9.16} \]

It is clear that the terms of the series (7.9.16) rapidly decrease and so we need only a few terms of the series to get s. Taking one term, we have s ≈ R/2, whence the circumference is C ≈ 6R. Taking two terms, we have s ≈ 0.521R, that is, C ≈ 6.252R. Three terms yield s ≈ 0.523R and C ≈ 6.276R, and so on.

We know that the circumference of a circle is C = 2πR. Comparing this with the results we have obtained, we can approximate the value of the number π: 3, 3.126, 3.138, .... The more terms of the series (7.9.16) we take, the more exact the value of π we obtain. The value of π to four decimal places is π ≈ 3.1416.

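The partial sums of the series for the 30° arc are easy to tabulate. The following is only a sketch: the function names are ours, and R = 1 is assumed. The coefficients are those of the binomial expansion of (1 - t)^(-1/2), and each term is integrated from 0 to R/2. Without the intermediate rounding of s to three decimal places, two terms give 3.125 and three terms give about 3.139.

```python
import math

def binomial_coeffs(n_terms):
    """Coefficients of (1 - t)**(-1/2) = 1 + t/2 + 3t^2/8 + 5t^3/16 + ..."""
    coeffs = [1.0]
    for n in range(1, n_terms):
        # c_n = c_{n-1} * (2n - 1) / (2n)
        coeffs.append(coeffs[-1] * (2 * n - 1) / (2 * n))
    return coeffs

def arc_30_degrees(n_terms, R=1.0):
    """Partial sum of the series for the arc subtending a central angle of 30 degrees."""
    total = 0.0
    for n, c in enumerate(binomial_coeffs(n_terms)):
        # integral of c * (x/R)**(2n) over 0 <= x <= R/2
        total += c * R * 0.5 ** (2 * n + 1) / (2 * n + 1)
    return total

def pi_estimate(n_terms):
    # C = 12 s and C = 2 pi R, hence pi = 6 s / R (with R = 1)
    return 6.0 * arc_30_degrees(n_terms)

for n in (1, 2, 3, 5, 10):
    print(n, pi_estimate(n))
```

Each new term is roughly four times smaller than the preceding one, which is why so few terms suffice.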

Exercises 

7.9.1. Write the arc length in the form of an integral for (a) the parabola y = x^2 from point (0, 0) to point (1, 1), (b) the exponential curve y = e^x from the point with x = 0 to the point with x = 1, and (c) the ellipse x^2/a^2 + y^2/b^2 = 1.

7.9.2. Complete Exercise 7.9.1b by making the change of variable 1 + e^{2x} = z^2 in the integral.

7.9.3. Use the approximate formulas (7.9.8) to find the arc length of the catenary curve between the points with x = 0 and x = 2 (a = 1). Compare the result with the exact value of the arc length.

7.9.4. Find approximately the arc length of the hyperbola xy = -1 between the points with x = 0.5 and x = 1. [Hint. In this case we cannot obtain an exact value because the integral in (7.9.2) cannot be expressed in terms of elementary functions.]

7.9.5. Obtain approximate values of the number π by computing the arc length of a circle with a central angle of 45° (retain three, four, and five terms in the series).


7.10 Curvature and the Osculating 
Circle 

Related to the arc length is the problem 
of determining the curvature k and the 
radius of curvature R of a curve at a 
point. For a circle the radius of cur- 
vature R is simply the circle’s radius, 
and the reciprocal of the radius is the 
curvature, k = 1/R. The smaller the
radius R of a circle (i.e. the greater the 
curvature k ), the tighter the circle 
(Figure 7.10.1). When R is very large, 
on the contrary, the curvature is hardly 
noticeable, that is, the circle is 
“almost” a straight line; for instance, 
the fact that the earth’s radius is so 
large means that we do not notice the 
curvature of the earth’s surface (to 
notice it, we would have to observe the 
earth from outer space, and this is 
not common practice). 

Let us take an arbitrary curve Γ now. It is clear (Figure 7.10.2) that the curve is curved more on the portion PQ, where it sharply turns to the left, than on the portion MN, where its direction does not change so rapidly. But what is the exact meaning of the last statement? The explanation is as follows.

Consider the angle φ between the tangents of the curve at the endpoints M and N of the arc MN. If over this arc the tangent rotates only in one direction when we move from point M to point N, the angle φ shows the magnitude of the rotation of the tangent on arc MN (or the magnitude of the "rotation" of the curve, since the direction of the curve is specified by the direction of the tangent). To determine the extent to which the curve is curved we must also know the arc length s of MN: if s is large, then even a slow change in the direction of the tangent can lead to a large value of φ.

Thus, the ratio

\[ k_{\mathrm{av}} = \frac{\varphi}{s} \tag{7.10.1} \]

specifies the average (or specific) curvature of the curve over arc MN. The curvature at point M of the curve is then defined as the limit of the average curvature when the length of arc MN tends to zero:

\[ k = \lim_{\Delta s \to 0} \frac{\Delta\alpha}{\Delta s}, \]

where Δs is the length of a (small) arc MM₁ of curve Γ, and Δα is the increment of angle α corresponding to this arc (α is the angle between the tangent to the curve at point M and the x axis), or the angle of rotation of the tangent to the curve (the angle between the tangents t and t₁ to Γ at points M and M₁; Figure 7.10.3). But by virtue of the definition of a derivative (see Section 2.1), we can rewrite the last limit as

\[ k = \frac{d\alpha}{ds}. \tag{7.10.2} \]

Figure 7.10.1

Thus, curvature k can be defined as the rate of rotation of the tangent to Γ as the point moves along the curve with a unit (linear) speed, that is, under conditions where point M travels the distance MM₁ = ds in time dt = ds.

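Now we need only recall that tan α = y', so that α = arctan y' and, hence,

\[ d\alpha = d(\arctan y') = \frac{y''\,dx}{1 + (y')^2}; \]

moreover, ds = \sqrt{1 + (y')^2}\,dx (see Section 7.9). Thus, finally,

\[ k = \frac{d\alpha}{ds} = \left[\frac{y''}{1 + (y')^2}\,dx\right] \Big/ \left[\sqrt{1 + (y')^2}\,dx\right] = \frac{y''}{[1 + (y')^2]^{3/2}} \tag{7.10.3} \]

and, hence,

\[ R = \frac{1}{k} = \frac{[1 + (y')^2]^{3/2}}{y''}. \tag{7.10.3a} \]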

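As a quick check of the formula for k in terms of y' and y'', we can evaluate it for two curves whose curvature is easy to guess. This is a minimal sketch; the function names are ours. For the upper semicircle y = sqrt(R^2 - x^2) the derivatives are y' = -x/y and y'' = -R^2/y^3, and the result should be -1/R at every point (the minus sign reflects the fact that the semicircle is convex upward).

```python
import math

def curvature(dy, d2y):
    """k = y'' / (1 + y'^2)^(3/2) for a curve given as y = y(x)."""
    return d2y / (1.0 + dy * dy) ** 1.5

def parabola_curvature(x):
    # y = x^2: y' = 2x, y'' = 2
    return curvature(2.0 * x, 2.0)

def upper_semicircle_curvature(x, R):
    # y = sqrt(R^2 - x^2): y' = -x / y, y'' = -R^2 / y^3  (valid for |x| < R)
    y = math.sqrt(R * R - x * x)
    return curvature(-x / y, -R * R / y ** 3)

print(parabola_curvature(0.0), parabola_curvature(1.0))
print(upper_semicircle_curvature(0.3, 2.0), upper_semicircle_curvature(1.7, 2.0))
```

The parabola is most strongly curved at its vertex, and the curvature falls off as |x| grows.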

If curve Γ is specified parametrically, x = x(t) and y = y(t) (see Section 1.8), then dy/dx = y'/x' and d²y/dx² = (x'y'' - x''y')/(x')³ (see formula (4.4.9)). Therefore, from (7.10.3) and (7.10.3a) it follows that

\[ k = \frac{(x'y'' - x''y')/(x')^3}{\left[1 + \dfrac{(y')^2}{(x')^2}\right]^{3/2}} = \frac{x'y'' - x''y'}{[(x')^2 + (y')^2]^{3/2}}, \qquad R = \frac{1}{k} = \frac{[(x')^2 + (y')^2]^{3/2}}{x'y'' - x''y'}. \tag{7.10.3b} \]

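The parametric formula can be tried out in the same way. The sketch below (names are ours) applies it to a circle x = r cos t, y = r sin t traversed counterclockwise, for which the curvature must equal 1/r, and to the cycloid x = a(t - sin t), y = a(1 - cos t); here all derivatives are taken with respect to the parameter t.

```python
import math

def curvature_parametric(dx, dy, d2x, d2y):
    """k = (x'y'' - x''y') / (x'^2 + y'^2)^(3/2), derivatives taken in t."""
    return (dx * d2y - d2x * dy) / (dx * dx + dy * dy) ** 1.5

def circle_curvature(t, r):
    # x = r cos t, y = r sin t
    return curvature_parametric(-r * math.sin(t), r * math.cos(t),
                                -r * math.cos(t), -r * math.sin(t))

def cycloid_curvature(t, a):
    # x = a (t - sin t), y = a (1 - cos t); not defined at the cusps t = 2*pi*n
    return curvature_parametric(a * (1 - math.cos(t)), a * math.sin(t),
                                a * math.sin(t), a * math.cos(t))

print(circle_curvature(0.7, 3.0))
print(cycloid_curvature(math.pi, 1.0))
```

At the top of the cycloid arch (t = π) the value is -1/(4a): the arch is convex upward and its radius of curvature there is twice the diameter of the generating circle.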

We note also that since α is a dimensionless quantity (we employ, as usual, radians for measuring angles) and s has the dimensions of length [l], the curvature k of a curve has the dimensions [l]^{-1} (say cm^{-1}) and the radius of curvature (as any radius), the dimensions of length [l], say cm.

The notion of the radius of curvature (and hence the notion of curvature) of a curve may be introduced in a more geometrical manner. To this end we draw normals MQ and M₁Q (lines perpendicular to the tangents) at two close-lying points M and M₁ on Γ (by Q we have denoted the point of intersection of the normals; see Figure 7.10.3). The angle MQM₁ between the normals is equal to the angle Δα between the tangents at points M and M₁ (in accord with a familiar theorem of geometry on angles with mutually perpendicular sides). From this we can find the distance MQ from the curve to the point of intersection of the normals.

We regard the small portion of the curve as an arc of a circle. A normal to the circle is clearly a radius, and the point of intersection of normals is clearly the center of the circle. If the curve were a circle of radius R, then ds would be equal to R dα, or dα/ds = 1/R; this quantity, dα/ds (the curvature of the circle), is constant for any portion of an arc of the circle. Now we select the arc of a circle in such a manner that it is as close as possible to the arc MM₁ ("hugs" the arc). The words "as close as possible" are understood as a requirement that the first three terms in Taylor's expansion of the equation y = y(x) of the curve,

\[ y = y(a) + y'(a)(x - a) + \frac{1}{2}y''(a)(x - a)^2 + \frac{1}{6}y'''(a)(x - a)^3 + \dots \tag{7.10.4} \]

(we assume that the point M on the curve corresponds to the value x = a of the abscissa), coincide with the first three terms in Taylor's expansion of the equation y = Y(x) of the circle,

\[ y = Y(a) + Y'(a)(x - a) + \frac{1}{2}Y''(a)(x - a)^2 + \frac{1}{6}Y'''(a)(x - a)^3 + \dots \tag{7.10.4a} \]

(see formula (6.1.18)).[7.29] In other words, we require that not only the values y(a) and Y(a) coincide but so do the first two derivatives of y(x) and Y(x) at point x = a:

\[ y(a) = Y(a), \qquad y'(a) = Y'(a), \qquad y''(a) = Y''(a). \tag{7.10.5} \]

But by virtue of (7.10.3), all this guarantees that at point M the curvatures of the curve Γ with equation y = y(x) and of the circle Σ with equation y = Y(x) coincide. This means that the radius R = MQ of circle Σ is expressed by formula (7.10.3a).

The circle Σ that passes through point M of the curve Γ and is the closest to Γ is known as the osculating circle of the curve.[7.30] The radius R of this circle is the radius of curvature of the curve, and the center Q of circle Σ is said to be the center of curvature of the curve (at point M). Thus, the radius of curvature of a curve can be defined as the radius of the osculating circle; the curvature of the curve is the reciprocal of the radius of the osculating circle.

We note also that since by virtue of (7.10.4), (7.10.4a), and (7.10.5), the difference Y(x) - y(x) = (1/6)[Y'''(a) - y'''(a)] dx³ + ... (where dx = x - a) changes sign as we pass through point M (i.e. as dx changes sign), the osculating circle Σ at point M of the curve Γ intersects the curve.[7.31] This makes it possible to describe the osculating circle thus.

7.29 A circle in a plane is defined by three numbers (parameters); these may be the coefficients a, b, and r in the equation of the circle (x - a)² + (y - b)² - r² = 0. To find these numbers (i.e. to define the circle) we must have three conditions, which may be the conditions in (7.10.5).

7.30 An osculating circle Σ of a curve Γ can also be defined as follows. Consider the circle S that passes through three points M₁, M₂, and M₃ of curve Γ that lie close to point M. When all three points move toward M, circle S tends to the osculating circle Σ of curve Γ at point M.

Let us consider various circles σ that touch Γ at point M and have at point M the same sense of convexity (the direction of bending), in other words, have the same tangent at point M as Γ and have centers that lie on the side of convexity of Γ (i.e. inside the curve) (Figure 7.10.4). Some of these circles lie completely inside Γ (i.e. on the side of convexity of Γ), and their curvature is greater than that of Γ. Others lie outside Γ, and their curvature is smaller than that of Γ. But there is only one circle that goes from one side of Γ to the other (this happens at point M), and this circle σ₀ lies closest to Γ and is called the osculating circle Σ.

If conditions (7.10.5) are met for a straight line (the tangent t to Γ at point M), then the radius of curvature of Γ at this point (which is calculated via (7.10.3a)) becomes infinite and the curvature k vanishes. Such points M are sometimes called the rectifying points of curve Γ. It is clear that at a rectifying point the curve Γ has no center of curvature Q (this center lies at infinity).

7.31 Exceptions in this case are only points where, in addition to conditions (7.10.5), we have Y'''(a) = y'''(a). Examples are the vertices of an ellipse (points where the ellipse intersects the axes of coordinates); at such points, as symmetry considerations suggest, the osculating circle cannot intersect the ellipse (by analogy, all points of a curve Γ that possess this property are called the vertices of Γ).

Figure 7.10.5

From (7.10.3) it follows that the signs of k and R (up till now we spoke only of the absolute values of k and R and ignored their signs) coincide with the sign of the second derivative y'', that is, characterize the sense of convexity of the curve (see Section 2.7). If the curvature is negative, that is, at point M we have a negative y'', the curve at this point lies below the tangent at point M, that is, the curve is convex (or convex upward; Figure 7.10.5a); the center of curvature Q then lies below point M. If the curvature is positive, that is, y'' > 0, the curve lies above the tangent and is said to be concave (convex downward; Figure 7.10.5b); in this case the center of curvature Q lies above point M. Finally, if the curvature at point M vanishes, that is, y'' = 0, then, as we already know, the curve at point M as a rule changes its sense of convexity; to one side of M the curve is convex upward and to the other it is convex downward (Figure 7.10.5c). At such a point the curve passes from one side of the tangent to the other, changes its sense of convexity (the direction of bending), and the point is called a point of inflection. At points of inflection, not only the first derivatives of the equation y = y(x) of curve Γ and of the equation y = Y(x) of tangent t coincide, but so do the second derivatives, y'' = Y'' = 0, that is, at such points the tangent t obeys the conditions (7.10.5); in other words, at a point of inflection M the tangent t acts as the osculating circle Σ, or M is a rectifying point of Γ.

For a straight line l, which of course can also be considered as a special case of a curve, the angle α between the tangent to l (which is simply l) and the x axis does not vary; this means that the curvature k of a straight line is identically zero (l is simply not curved in any way). For a circle S, the angle α changes, but the rate k = dα/ds at which α changes along the circle remains constant; in other words, we may say that circles (and straight lines, for that matter) are curves of constant curvature. It can be shown that these are the only types of such curves.

Suppose that M(x, y(x)) is a point of a curve Γ. The tangent t to the curve at point M has a slope y' (= tan α), and the normal n is perpendicular to t, that is, forms an angle β = α + 90° with the x axis (see Figure 7.10.3); the slope of the normal is tan β = -cot α = -1/y'. Therefore, the (directed) projections MP and PQ on the axes of coordinates of the line segment MQ of length R are, respectively, p = R cos β = -R sin α and q = R sin β = R cos α, which means that the coordinates of the center of curvature Q are

\[ \xi = x + p = x - R\sin\alpha, \qquad \eta = y + q = y + R\cos\alpha. \tag{7.10.6} \]

If now we take into account formula (7.10.3a) for R and the obvious fact that

\[ \cos\alpha = \frac{1}{\sqrt{1 + \tan^2\alpha}} = \frac{1}{\sqrt{1 + (y')^2}}, \qquad \sin\alpha = \frac{\tan\alpha}{\sqrt{1 + \tan^2\alpha}} = \frac{y'}{\sqrt{1 + (y')^2}}, \]

we finally get

\[ \xi = x - \frac{\sqrt{[1 + (y')^2]^3}}{y''}\cdot\frac{y'}{\sqrt{1 + (y')^2}}, \qquad \eta = y + \frac{\sqrt{[1 + (y')^2]^3}}{y''}\cdot\frac{1}{\sqrt{1 + (y')^2}}, \]

that is,

\[ \xi = x - \frac{1 + (y')^2}{y''}\,y', \qquad \eta = y + \frac{1 + (y')^2}{y''}. \tag{7.10.7} \]

(This agrees with the condition that η > y for k > 0, or for y'' > 0, or the condition that η < y for k < 0, or for y'' < 0.)

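The expressions for the coordinates of the center of curvature in terms of x, y, y', and y'' translate directly into a few lines of code. This is a sketch with names of our own choosing; it is applied to the parabola y = x^2, for which y' = 2x and y'' = 2.

```python
def center_of_curvature(x, y, dy, d2y):
    """Center (xi, eta) of the osculating circle of y = y(x); requires y'' != 0."""
    factor = (1.0 + dy * dy) / d2y
    xi = x - factor * dy
    eta = y + factor
    return xi, eta

def parabola_center(x):
    # y = x^2: y' = 2x, y'' = 2
    return center_of_curvature(x, x * x, 2.0 * x, 2.0)

for x in (0.0, 0.5, 1.0):
    print(x, parabola_center(x))
```

At the vertex x = 0 the center lies at (0, 1/2), that is, above the curve, as it should for y'' > 0.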
If the curve is specified parametrically, x = x(t) and y = y(t), we can write (see (7.10.3b))

\[ \xi = x - \frac{(x')^2 + (y')^2}{x'y'' - x''y'}\,y', \qquad \eta = y + \frac{(x')^2 + (y')^2}{x'y'' - x''y'}\,x' \tag{7.10.7a} \]

(see Exercise 7.10.4).

The locus of the centers of curvature Q(ξ, η) of a curve Γ constitutes a new curve γ known as the evolute of curve Γ. The curve Γ with respect to its own evolute γ is called the involute.[7.32]

Example 1. The parabola y = x². Here y' = 2x and y'' = 2 (a constant), whereby the curvature k and the radius of curvature R of the parabola at point M(x, y) (= M(x, x²)) are

\[ k = \frac{2}{\sqrt{(1 + 4x^2)^3}}, \qquad R = \frac{\sqrt{(1 + 4x^2)^3}}{2}. \]

The center of curvature Q(ξ, η) of the parabola with respect to point M(x, x²) has coordinates

\[ \xi = x - \frac{1 + 4x^2}{2}\cdot 2x = -4x^3, \qquad \eta = x^2 + \frac{1 + 4x^2}{2} = 3x^2 + \frac{1}{2}, \]

that is,

\[ \xi^2 = \frac{16}{27}\left(\eta - \frac{1}{2}\right)^3, \quad\text{or}\quad \xi = c\,\eta_1^{3/2}, \tag{7.10.8} \]

where c = -4/(3\sqrt{3}) and η₁ = η - 1/2.

Thus, the evolute of y = x² is the semicubical parabola (7.10.8) whose cusp coincides with point (0, 1/2) and whose axis of symmetry is the axis of ordinates (Figure 7.10.6).

7.32 The word "involute" is the Latin for "unwinding", while the word "evolute" is the Latin for "unwound," that is the curve that is being unwound; these names are related to the properties of evolutes and involutes discussed at the end of this section (see the text in small print).

Figure 7.10.6

Example 2. The cycloid x = a(t - sin t), y = a(1 - cos t). We have x' = a(1 - cos t), y' = a sin t, x'' = a sin t, and y'' = a cos t (the primes denote derivatives with respect to parameter t). Whence,

\[ (x')^2 + (y')^2 = a^2(1 - \cos t)^2 + a^2\sin^2 t = a^2(2 - 2\cos t) = 4a^2\sin^2\frac{t}{2}, \]

\[ y''x' - x''y' = a\cos t\cdot a(1 - \cos t) - (a\sin t)^2 = a^2(\cos t - 1) = -2a^2\sin^2\frac{t}{2}. \]

Thus, in view of (7.10.7a) we have

\[ \xi = a(t - \sin t) + 2a\sin t = a(t + \sin t), \qquad \eta = a(1 - \cos t) - 2a(1 - \cos t) = -a + a\cos t. \tag{7.10.9} \]

To understand what curve is represented by these equations, we introduce a new parameter, t₁ = t + π (or t = t₁ - π); since sin t = -sin t₁ and cos t = -cos t₁, we have

\[ \xi = a(t_1 - \sin t_1) - a\pi, \qquad \eta = a(1 - \cos t_1) - 2a. \tag{7.10.10} \]

Thus, the evolute (7.10.9) (or (7.10.10)) of a cycloid is also a cycloid, ξ₁ = a(t₁ - sin t₁) and η₁ = a(1 - cos t₁), where ξ₁ = ξ + aπ and η₁ = η + 2a, only shifted in relation to the initial cycloid by 2a units downward (2a is the diameter of the generating circle) and πa units to the left (πa is the half-width of the arch of the initial cycloid).

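The result for the cycloid can also be verified numerically. In the sketch below (function names are ours) the center of curvature is computed from the general parametric expressions for ξ and η and compared with the closed form ξ = a(t + sin t), η = -a + a cos t. The cusps t = 2πn must be avoided, since x'y'' - x''y' vanishes there.

```python
import math

def evolute_point(x, y, dx, dy, d2x, d2y):
    """Center of curvature for a parametric curve; derivatives are taken in t."""
    factor = (dx * dx + dy * dy) / (dx * d2y - d2x * dy)
    return x - factor * dy, y + factor * dx

def cycloid_evolute(t, a=1.0):
    x = a * (t - math.sin(t))
    y = a * (1.0 - math.cos(t))
    dx, dy = a * (1.0 - math.cos(t)), a * math.sin(t)
    d2x, d2y = a * math.sin(t), a * math.cos(t)
    return evolute_point(x, y, dx, dy, d2x, d2y)

a = 2.0
for t in (0.5, 1.0, 3.0, 5.0):
    xi, eta = cycloid_evolute(t, a)
    print(t, xi - a * (t + math.sin(t)), eta - (-a + a * math.cos(t)))
```

Both differences printed on each line should be zero up to rounding error.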
To discuss the properties of evolutes and involutes in detail, we must first differentiate in (7.10.6):

\[ d\xi = dx - R\cos\alpha\,d\alpha - \sin\alpha\,dR, \qquad d\eta = dy - R\sin\alpha\,d\alpha + \cos\alpha\,dR. \]

But since (see (7.9.2b))

\[ dx = \cos\alpha\,ds = \cos\alpha\,\frac{ds}{d\alpha}\,d\alpha = R\cos\alpha\,d\alpha, \qquad dy = \sin\alpha\,ds = \sin\alpha\,\frac{ds}{d\alpha}\,d\alpha = R\sin\alpha\,d\alpha, \]

we have

\[ d\xi = -\sin\alpha\,dR, \qquad d\eta = \cos\alpha\,dR. \tag{7.10.11} \]

Dividing one by the other, we get

\[ \frac{d\eta}{d\xi} = -\cot\alpha = \tan(\alpha + 90^\circ). \tag{7.10.12} \]

From this it follows that a tangent to evolute γ of a curve Γ (i.e. to the locus of the centers of curvature Q(ξ, η)) forms an angle β = α + 90° with the x axis, that is, has the direction of the normal to Γ. And since point Q belongs to the normal n to curve Γ at point M, the respective tangent to γ coincides with n, that is, tangents to the evolute γ of a curve Γ coincide with the normals to Γ (Figure 7.10.8).[7.33]

On the other hand, from (7.10.11) it follows that

\[ d\xi^2 + d\eta^2 = \sin^2\alpha\,dR^2 + \cos^2\alpha\,dR^2 = dR^2, \quad\text{i.e.}\quad d\sigma = dR, \tag{7.10.12a} \]

where σ is the length of the arc of evolute γ of curve Γ (both dσ and dR are assumed positive).

Figure 7.10.8

The last remark has the following meaning. Let us consider an arc M₁M₂ of curve Γ such that on it the derivative R' = dR/ds retains its sign. We will agree to reckon the arc length along Γ in such a manner that the derivative of R is positive; in other words, we assume that the arc length is reckoned along Γ in the direction in which the radius of curvature grows. Then the increment M₂Q₂ - M₁Q₁ = R₂ - R₁ = ΔR of the radius of curvature along arc M₁M₂ is equal to the length Δσ of the evolute arc. And since the line segments M₁Q₁ (= R₁) and M₂Q₂ (= R₂) are tangents to evolute γ of curve Γ (see Figure 7.10.8), the process of reconstructing the involute Γ from its evolute γ resembles the unwinding under tension of a nonexpandable string from a spool shaped like γ; the end M of the string describes curve Γ. That is the origin of the names evolute and involute (see footnote 7.32).

By way of an example, let us construct the (parametric) equation of the involute of a circle of radius r and equation x² + y² = r², or x = r cos t and y = r sin t, where t is the parameter (angle). Let us assume that the involute passes through point A(r, 0) and that the unwinding of the string wound on the round spool occurs counterclockwise (Figure 7.10.9). The tangent MT to the circle at point M(r cos t, r sin t) forms an angle α = t + 90° with the x axis. The line segment MT of length rt (rt is the length of arc AM of the circle) is laid off along the tangent from M back toward the side of A, that is, in the direction opposite to that in which M moves; its projections on the axes of coordinates are therefore -rt cos α = rt sin t and -rt sin α = -rt cos t. This readily yields the following parametric expressions for the coordinates X, Y of point T(X, Y) (i.e. the equation of the involute of the circle):

\[ X = r\cos t + rt\sin t, \qquad Y = r\sin t - rt\cos t. \tag{7.10.13} \]

(With this choice of direction the velocity of point T is perpendicular to MT, as is required of a curve whose normals are the tangents to the circle.)

The theory of evolutes and involutes was first developed by Christian Huygens in his Horologium Oscillatorium (1673) in connection with the following problem. We wish to construct the equation of motion of a pendulum, that is, a material particle of mass m (the bob) moving along a curve γ:

\[ m\frac{d^2 s}{dt^2} = -mg\sin\alpha, \tag{7.10.14} \]

where s is the arc length of curve γ and t is time, so that d²s/dt² is the (tangential, i.e. directed along the tangent to γ) acceleration of the particle, g is the acceleration of gravity, α is the angle formed by the tangent l to curve γ and the horizontal, and mg sin α is the projection of the (vertical) force of gravity mg on l (Figure 7.10.10; see Chapters 9 and 10). The period of oscillations of the pendulum is, strictly speaking, dependent on the swing (i.e. on the uppermost point on curve γ at which the pendulum starts oscillating). Huygens posed the question of what shape curve γ should have so that the pendulum bob swinging along this curve would take exactly the same time to complete swings of large and of small amplitude. By analyzing the differential equation (7.10.14) (which was not a simple task because at that time there was no differential and integral calculus) Huygens found that γ must be a cycloid. But how do we make a pendulum move along a cycloid? It was in this connection that Huygens developed the general theory of involutes and found that the evolute of a cycloid is also a cycloid (see Figure 7.10.7 and Eq. (7.10.10)). He also demonstrated that if at the point of suspension of the string of the pendulum certain "lips" in the form of a cycloid are attached, so that in the process of oscillations the string winds about them, then the end of the string (with the bob) of constant length describes a cycloid (Figure 7.10.11). The first pendulum clock constructed by Huygens had just such "lips"; however, later he dropped this idea because he found that the "lips" had practically no effect on the precision of the clock, since in view of the smallness of angles α in Eq. (7.10.14) we can always replace sin α with α, and the equation m(d²s/dt²) = -mgα implies that the period of oscillations of a pendulum of constant length l is constant (see Section 10.3).

7.33 In this connection it is often said that evolute γ of curve Γ is the envelope of the normals to Γ.

Figure 7.10.11


Exercises 

7.10.1. Calculate the curvature (at an arbitrary point of the curve) of (a) an ellipse x = a cos t and y = b sin t, (b) a hyperbola y = 1/x, and (c) a fourth-order parabola y = ax⁴.

7.10.2. Find the evolutes of (a) an ellipse x = a cos t and y = b sin t, and (b) a hyperbola y = 1/x.

7.10.3. Prove that the points of maxima or minima in the curvature of a curve correspond to the cuspidal points of the evolute of this curve (see Figures 7.10.6 and 7.10.7).

7.10.4. Prove the validity of formulas (7.10.7a).

7.11 Solid Geometry Applications 
of Integral Calculus 

In Section 3.6 we obtained the formula

\[ V = \int_a^b S(x)\,dx, \tag{7.11.1} \]

where S(x) is the area of a section of a solid by a plane x = constant perpendicular to the x axis (we advise the reader to repeat the derivation of this formula). This formula was used to obtain an expression for the volume of a pyramid. The volume of a cone is obtained in exactly the same manner. Place the origin of the system of coordinates at the center of the circle of the cone base and send the x axis along the altitude of the cone upward (Figure 7.11.1). Let S(x) be the area of the section of the cone by a plane perpendicular to the altitude and distant x from the base. This section is a circle of radius r_x. From the similarity of triangles we have r_x/r = (H - x)/H, where r is the radius of the base and H is the altitude of the cone. Whence, r_x = (r/H)(H - x) and S(x) = πr_x² = πr²(H - x)²/H² and, consequently,

\[ V = \int_0^H \frac{\pi r^2}{H^2}(H - x)^2\,dx = -\frac{\pi r^2}{H^2}\,\frac{(H - x)^3}{3}\Big|_0^H = \frac{\pi r^2 H^3}{3H^2} = \frac{\pi r^2 H}{3}. \]

Figure 7.10.10

Figure 7.11.1

To obtain the volume of a sphere of radius R, we place the origin at the center of the sphere (Figure 7.11.2). The section cut by a plane perpendicular to the x axis and distant x from the origin is a circle of radius R_x. By the Pythagorean theorem R_x = \sqrt{R^2 - x^2} and so S(x) = πR_x² = π(R² - x²), whence

\[ V = \int_{-R}^{R} \pi(R^2 - x^2)\,dx = \pi\left(R^2 x - \frac{x^3}{3}\right)\Big|_{-R}^{R} = \frac{4}{3}\pi R^3. \]

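The volume integral over the areas of the sections is easy to evaluate numerically for any S(x). Below is a small sketch; the helper `simpson` and the sample dimensions are our own assumptions. Since S(x) is a quadratic polynomial for both the cone and the sphere, Simpson's rule reproduces the closed-form volumes up to rounding.

```python
import math

def simpson(f, a, b, n=100):
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def volume_by_sections(S, a, b):
    """V = integral of the cross-section area S(x) from a to b."""
    return simpson(S, a, b)

r, H, R = 1.5, 4.0, 2.0   # sample dimensions (arbitrary)
cone = volume_by_sections(lambda x: math.pi * (r / H) ** 2 * (H - x) ** 2, 0.0, H)
sphere = volume_by_sections(lambda x: math.pi * (R * R - x * x), -R, R)

print(cone, math.pi * r * r * H / 3.0)
print(sphere, 4.0 / 3.0 * math.pi * R ** 3)
```

The same function can be used with any other S(x), for instance that of a pyramid.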
Formula (7.11.1) results in Cavalieri's theorem,[7.34] which says that if two solids have equal altitudes and if sections made by planes P and Q parallel to the bases and at equal distances from them always have a given ratio, the volumes of the two solids have this given ratio to each other. Indeed, in calculating the volumes of such solids using formula (7.11.1), where the x axis is perpendicular to planes P and Q, we see that the ratio of the volumes is the same as the ratio of the integrands S(x).

7.34 This theorem played an important role in formulating the concept of the integral. Bonaventura Cavalieri (1598-1647), a pupil of Galileo, gave this theorem (without proof) in his Geometria indivisibilibus (The Geometry of Indivisibles) (1635).

Figure 7.11.2

Let a solid be generated by revolving the curvilinear trapezoid depicted in Figure 7.11.3 about the x axis. In this case, the section is a circle of radius y = f(x) and S(x) = πy². Using (7.11.1), we find the well-known formula for the volume of a solid of revolution:

\[ V = \pi\int_a^b y^2\,dx. \tag{7.11.2} \]

Let us find, say, the volume of a solid generated by the revolution of the upper half of the ellipse x²/a² + y²/b² = 1 about the x axis (make a drawing). This solid is called an ellipsoid of revolution. Since for an ellipse y = (b/a)\sqrt{a^2 - x^2}, formula (7.11.2) yields

\[ V = \pi\int_{-a}^{a} \frac{b^2}{a^2}(a^2 - x^2)\,dx = \frac{4}{3}\pi a b^2. \]

For a = b = R we obtain, as expected, the volume (4/3)πR³ of a sphere of radius R.

Now let us derive the formula for the surface area of a solid of revolution (Figure 7.11.4). We consider a solid bounded by sections passing through the points x and x + dx. Denote by dF the lateral surface area of this solid. Regarding it as the frustum of a cone, we get

\[ dF \approx \pi\,[y(x) + y(x + dx)]\,ds, \]

where ds is the length of the small portion of the curve by revolving which we get the surface of the solid; ds = \sqrt{1 + (y'(x))^2}\,dx (see Section 7.9). The sum y(x) + y(x + dx) can be replaced with 2y(x), disregarding the quantity y'(x) dx as compared with y(x).[7.35] Therefore

\[ dF \approx 2\pi\,y(x)\sqrt{1 + (y'(x))^2}\,dx. \]

The entire surface area of the initial solid of revolution is

\[ F = 2\pi\int_a^b y(x)\sqrt{1 + (y'(x))^2}\,dx. \tag{7.11.3} \]

7.35 Note that in the expression for dF the sum y(x) + y(x + dx) is multiplied by ds, so that the quantity we ignore is of the order of dx ds ≈ dx².

Figure 7.11.4

The surface area of a sphere is readily found by means of this formula. Indeed, a sphere is generated by the revolution of the upper semicircle about the x axis. The equation of a circle is x² + y² = R², whence

\[ y = \sqrt{R^2 - x^2}, \qquad y' = -\frac{x}{\sqrt{R^2 - x^2}}, \tag{7.11.4} \]

so that \sqrt{1 + (y')^2} = R/\sqrt{R^2 - x^2}. Substituting into (7.11.3) yields

\[ F = 2\pi\int_{-R}^{R} \sqrt{R^2 - x^2}\,\frac{R}{\sqrt{R^2 - x^2}}\,dx = 2\pi R x\,\Big|_{-R}^{R} = 4\pi R^2. \]

Moreover, from (7.11.3) and (7.11.4) we can easily find the surface of a spherical layer (spherical zone) cut out of a sphere by two parallel planes x = a and x = b > a (see Figure 7.11.2; if, say, b = R, the layer becomes a spherical segment or spherical cap). Indeed, we have

\[ F = 2\pi\int_a^b \sqrt{R^2 - x^2}\,\frac{R}{\sqrt{R^2 - x^2}}\,dx = 2\pi R x\,\Big|_a^b = 2\pi R(b - a). \]

Thus, the sought-for surface area depends only on the altitude b - a of the layer (segment) and the radius R of the sphere, which is really a remarkable and beautiful result.

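The independence of the zone area from the position of the zone is easy to observe numerically. The sketch below (the helper names and sample numbers are ours) evaluates the surface-area integral 2π ∫ y sqrt(1 + y'^2) dx for two zones of equal altitude cut from the same sphere, one near the "pole" and one at the "equator". The endpoints x = ±R are avoided because y' is infinite there.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def surface_of_revolution(y, dy, a, b):
    """F = 2*pi * integral of y(x) * sqrt(1 + y'(x)^2) dx from a to b."""
    return 2.0 * math.pi * simpson(lambda x: y(x) * math.sqrt(1.0 + dy(x) ** 2), a, b)

R = 2.0
y = lambda x: math.sqrt(R * R - x * x)
dy = lambda x: -x / math.sqrt(R * R - x * x)

zone_near_pole = surface_of_revolution(y, dy, -1.9, -0.9)   # altitude 1
zone_at_equator = surface_of_revolution(y, dy, -0.5, 0.5)   # altitude 1

print(zone_near_pole, zone_at_equator, 2.0 * math.pi * R * 1.0)
```

All three printed numbers should agree, since each zone has altitude b - a = 1.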
Formulas (7.11.2) and (7.11.3) can be written also in a different way. Suppose the object depicted in Figure 7.11.3 is a metal plate of constant thickness. In this case the mass M of this plate is

\[ M = \mu\int_a^b y\,dx = \mu S, \tag{7.11.5} \]

where μ is the density of the material of the plate (density per unit surface area; see Section 9.13), and S = \int_a^b y\,dx is the surface area of the plate. The distance y_C from the center of gravity to the x axis is expressed by the formula

\[ y_C = \frac{1}{M}\int_a^b \mu\,\frac{y^2}{2}\,dx = \frac{1}{2S}\int_a^b y^2\,dx \tag{7.11.6} \]

(see (9.13.12)). Comparing (7.11.2) and (7.11.6), we see that

\[ V = 2\pi S y_C = 2\pi y_C S, \tag{7.11.7} \]

or that the volume V of a solid of revolution is equal to the area S of the figure whose rotation generates the solid multiplied by the circumference 2πy_C of the circle described in the process of rotation by the center of gravity of the figure. (Here the center of gravity, or centroid, of a flat figure is assumed to coincide with the center of gravity of a homogeneous plate whose shape would be that of the figure.) This theorem is known as the (first) theorem of Pappus.[7.36]

Formula (7.11.7) and the theorem of Pappus can be modified so that they incorporate the case of a solid generated by the rotation of a figure of arbitrary shape, and not only a curvilinear trapezoid.

Take, for example, the solid generated by the rotation about the x axis of the figure depicted in Figure 7.11.5. In this case the section of the solid by a plane perpendicular to the x axis is a circular ring (annulus) bounded by the circles of radii y₁(x) and y₂(x). Hence

\[ S(x) = \pi[y_1(x)]^2 - \pi[y_2(x)]^2 = \pi(y_1^2 - y_2^2) \]

and, therefore, the volume of the solid of revolution is

\[ V = \pi\int_a^b (y_1^2 - y_2^2)\,dx. \tag{7.11.8} \]

On the other hand, the distance y_C from the x axis to the center of gravity (centroid) C of a homogeneous plate shaped as K in Figure 7.11.5 is

\[ y_C = \frac{1}{2S}\int_a^b (y_1^2 - y_2^2)\,dx = \frac{\int_a^b (y_1^2 - y_2^2)\,dx}{2\int_a^b (y_1 - y_2)\,dx} \tag{7.11.9} \]

(see (9.13.11)). Hence here also V = 2πy_C S.

Formula (7.11.3) can be interpreted in a similar manner. Suppose that the surface of a solid of revolution is generated by the rotation of an arc AB (see Figure 7.11.4) and imagine that the arc is a homogeneous heavy string with a constant density σ (mass per unit length; see the text between formulas (9.12.1) and (9.12.2)). In this case the distance y_C from the center of gravity of the string C to the x axis is given by the following formula:

\[ y_C = \frac{1}{M}\int_{s_0}^{s_1} \sigma\,y(x)\,ds = \frac{1}{L}\int_{s_0}^{s_1} y(x)\,ds = \frac{1}{L}\int_a^b y(x)\sqrt{1 + (y'(x))^2}\,dx; \tag{7.11.10} \]

here x(s₀) = a, x(s₁) = b, ds is the element of length of arc AB, L = \int_{s_0}^{s_1} ds = \int_a^b \sqrt{1 + (y'(x))^2}\,dx is the length of the string, and M = σL is the mass of the string (see formulas (9.13.4) and (9.13.5b)).

Comparing (7.11.3) and (7.11.10), we get

\[ F = 2\pi L y_C = 2\pi y_C L, \tag{7.11.11} \]

which means that the surface area of a solid of revolution is the product of the length L of the arc whose rotation generates the surface by the circumference 2πy_C of the circle described by the centroid of the curve in the process of rotation. This result (known as the second theorem of Pappus) can be generalized so as to incorporate the case of a curve of arbitrary shape (say the one that is the boundary of figure K in Figure 7.11.5) and not only arc AB (see Figure 7.11.3), which is the boundary of a curvilinear trapezoid.

7.36 Pappus of Alexandria (end of 3rd century A.D.) was the last of the great Greek mathematicians; in his works he set forth (usually without any proof) many results obtained by his predecessors and also by himself. The fact that the theorems of Pappus are often called the Pappus-Guldin theorems or even simply Guldin's theorems in the literature is completely groundless. Both theorems were formulated by Pappus, while their proofs, put forward by Paul Guldin (1577-1643), a Swiss monk and an amateur mathematician, were highly doubtful and substantially weaker than the proofs of the theorems given by B. Cavalieri and the famous Johann Kepler (1571-1630) (the latter proofs were harshly, but unjustly, criticized by Guldin).

Figure 7.11.5

Figure 7.11.7

The theorems of Pappus (7.11.7) and (7.11.11) often prove very useful. For instance, let us find the volume and surface area of a torus (anchor ring), which is generated by the rotation about the x axis of a circle of radius r whose center is d distant from the axis of rotation (Figure 7.11.6). Here, obviously, the center of gravity C of the circle (and of the circumference of the circle) coincides with the (geometrical) center Q of the circle. Hence, the volume of the torus in Figure 7.11.7 is

$$V = 2\pi d \times \pi r^2 = 2\pi^2 r^2 d, \qquad (7.11.12)$$

and the surface area is

$$F = 2\pi d \times 2\pi r = 4\pi^2 r d. \qquad (7.11.13)$$

The theorems of Pappus can be used in the reverse direction, so to say. The sphere in Figure 7.11.2 can be obtained by rotating a semicircle about the diameter that is part of the boundary of this semicircle. Then, if the radius of the sphere is R, the area of the semicircle is $S = \pi R^2/2$ and, by (7.11.7), the volume will be

$$V = 2\pi y_C \cdot \frac{\pi R^2}{2} = \pi^2 R^2 y_C.$$

Since we know that the volume V of a sphere is $(4/3)\pi R^3$, we can find the distance $y_C$ from the center of gravity of a semicircle to the diameter:

$$y_C = \frac{(4/3)\pi R^3}{\pi^2 R^2} = \frac{4}{3\pi}R \approx 0.4R.$$

Similarly, using the second theorem of 
Pappus, we can find the center of gravity of 
a semicircumference (see Exercise 7.11.3). 
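The reverse use of the first theorem of Pappus can be checked numerically. The sketch below is only an illustration: it assumes a half-disk whose lower boundary is the diameter ($y_1 = 0$, $y_2 = \sqrt{R^2 - x^2}$), evaluates the centroid integral of (7.11.9) by the midpoint rule, and compares the result with $4R/(3\pi)$. The function name `half_disk_centroid` and the torus dimensions `r`, `d` are hypothetical choices made for the example.

```python
import math

def half_disk_centroid(R, n=100_000):
    """Distance from the diameter to the centroid of a half-disk of radius R.

    Uses y_C = (1 / (2 S)) * integral of y(x)**2 dx over [-R, R] with
    y(x) = sqrt(R**2 - x**2), evaluated by the midpoint rule.
    """
    h = 2.0 * R / n
    integral = 0.0
    for i in range(n):
        x = -R + (i + 0.5) * h
        integral += (R * R - x * x) * h  # y(x)**2 on the upper semicircle
    area = math.pi * R * R / 2.0
    return integral / (2.0 * area)

R = 1.0
y_c = half_disk_centroid(R)

# First theorem of Pappus: V = 2 * pi * y_C * S for the rotated half-disk.
sphere_volume = 2.0 * math.pi * y_c * (math.pi * R**2 / 2.0)

# Torus formulas (7.11.12) and (7.11.13) for hypothetical dimensions.
r, d = 1.0, 3.0
torus_volume = 2.0 * math.pi * d * math.pi * r**2
torus_area = 2.0 * math.pi * d * 2.0 * math.pi * r

print(y_c, 4.0 * R / (3.0 * math.pi))
print(sphere_volume, 4.0 / 3.0 * math.pi * R**3)
```

The two numbers on each printed line should agree to many decimal places, since the midpoint rule is very accurate for the quadratic integrand used here.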


Exercises 

7.11.1. Find the volume of a cone using 
the fact that the cone is a solid generated by 
rotating a right triangle about one of its legs. 

7.11.2. Find the volume of a solid generated by rotating a figure bounded from above by the curve $y = \sqrt{x}$, from below by the x axis, and on the right by the vertical $x = 2$ about the x axis.

7.11.3. Using the second theorem of Pap- 
pus, find the center of gravity of a semicir- 
cumference. 


7.12 Curve Sketching 

The most primitive method of constructing the graph of a function $f(x)$ is to compute the value of $f(x_n)$ for a large number of points $x_n$. The usual procedure is to choose points $x_n$ in the form $x_n = x_0 + na$, with $n = 0, \pm 1, \pm 2, \ldots$. This, clearly, is quite an extravagant method. In order to see the variation of the function on an interval $\Delta x$, we have to choose a step a much less than $\Delta x$, or $a \ll \Delta x$. And if the step (subinterval) is small, a very large number of points are required to embrace the whole range we are interested in. And yet we encounter the problem of constructing curves very often, since knowing the graph of a function provides extensive information about the properties of the function; for one, it immediately gives the number of real roots of the equation $f(x) = 0$, gives the intervals within which these roots lie, and shows the maxima and minima of the function.

The techniques considered in Sec- 
tions 7.1 and 7.2 make it possible to 
construct graphs much faster and more 
reliably and to gain a general picture 
of the shape of the curve. This requires, 
first of all, that we find the character- 
istic points of the graph— maxima, 
minima, discontinuities, salient points, 
points of inflection, etc. 

Let us illustrate this fact by using the example of the graph of a third-degree polynomial,

$$y = ax^3 + bx^2 + cx + d. \qquad (7.12.1)$$

Figure 7.12.1

To be specific, we wish to construct the graph of the function

$$y = 0.5x^3 - 0.75x^2 - 3x + 2.5. \qquad (7.12.2)$$

First we find the maxima and minima. Equating to zero the derivative of (7.12.2), we have

$$y' = 1.5x^2 - 1.5x - 3 = 0, \qquad (7.12.3)$$

whence we find the two roots of Eq. (7.12.3): $x_1 = -1$ and $x_2 = 2$.

Let us investigate each of these values separately. To do this, we find $y''$. We see that $y'' = 3x - 1.5$, and $y''(-1) = -4.5 < 0$ and $y''(2) = 6 - 1.5 = 4.5 > 0$. This means that at $x = -1$ the function has a maximum,

$$y_{\max} = -0.5 - 0.75 + 3 + 2.5 = 4.25,$$

while at $x = 2$ the function has a minimum, $y_{\min} = -2.5$.

Now let us see how the polynomial behaves for very large x (in absolute value). Note that for very large x the term containing $x^3$ will appreciably exceed the other terms in absolute value. Therefore, the sign of the polynomial (7.12.2) is determined by the sign of $0.5x^3$ (for large absolute values of x): for $x > 0$ we have $y > 0$, that is, the right-hand branch of the curve goes up, while for $x < 0$ we have $y < 0$, that is, the left branch goes down. (It is clear that if a in (7.12.1) is negative, the left branch goes up and the right branch goes down.)

Figure 7.12.2

Let us find the points of inflection, that is, points where $f''(x) = 0$ (see Section 7.10). From (7.12.3) it follows that $y'' = 3x - 1.5$, that is, $y'' = 0$ only at $x = 0.5$. Note that the graph of a third-degree polynomial always has a point of inflection, and it is unique (at this point the tangent to the graph intersects the graph; see Figure 7.12.1). Indeed, if y is a third-degree polynomial, then the equation $y'' = 0$ is a first-degree equation. It always has a unique root, $x_0$. At this point the difference $y - \tilde{y}$, where $\tilde{y}$ is the ordinate of the tangent to the curve, has the form $A(x - x_0)^3$, that is, changes sign when x passes through the value $x_0$. This means that at this point the tangent is sure to intersect the curve.

We return to the construction of the graph and compute the ordinate y of the point of inflection to get $y = 0.875$. Let us also determine the direction of the tangent to the curve at the point of inflection. Using (7.12.3), we get $\tan\alpha = y'(0.5) = -3.375$. Using all the foregoing arguments, we get the shape of the curve (7.12.2) depicted in Figure 7.12.2.
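The characteristic points found above can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original text: it assumes the coefficients $a = 0.5$, $b = -0.75$, $c = -3$, $d = 2.5$ (the ones consistent with the derivative (7.12.3)), and the helper name `cubic_features` is hypothetical.

```python
import math

def cubic_features(a, b, c, d):
    """Extrema and inflection point of y = a x^3 + b x^2 + c x + d (a != 0)."""
    y = lambda x: ((a * x + b) * x + c) * x + d
    dy = lambda x: (3 * a * x + 2 * b) * x + c
    d2y = lambda x: 6 * a * x + 2 * b

    extrema = []
    disc = (2 * b) ** 2 - 4 * (3 * a) * c  # discriminant of y' = 0
    if disc > 0:  # two distinct real roots of y' = 0
        for sign in (-1, 1):
            x = (-2 * b + sign * math.sqrt(disc)) / (6 * a)
            kind = "max" if d2y(x) < 0 else "min"
            extrema.append((x, y(x), kind))
        extrema.sort()

    x0 = -b / (3 * a)  # the unique root of y'' = 0
    inflection = (x0, y(x0), dy(x0))  # abscissa, ordinate, slope of tangent
    return extrema, inflection

extrema, inflection = cubic_features(0.5, -0.75, -3.0, 2.5)
print(extrema)     # extrema as (x, y, kind)
print(inflection)  # (x0, y(x0), y'(x0))
```

If the discriminant is not positive, the list of extrema stays empty, which corresponds to the cases without a maximum or minimum discussed later in this section.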

Of course, if we do not compute any other values of the function, the resulting graph will give only a very rough qualitative idea of the behavior of the function, but even such a graph enables us to count the number of roots (i.e. the number of points of intersection of the graph with the x axis) and to draw certain conclusions about their values. In our example, Figure 7.12.2, we see that there are three roots, that one of the roots lies somewhere between 0.5 and 2, that the second root must definitely be positive (it even exceeds 2), while the third root is negative (it is less than $-1$).

Figure 7.12.3

The graph may be improved by comp- 
uting a few more values of the function 
for certain values of x. For example, let 
us compute three more values of the 
function in this example. For x = 0 
we have y = 2.5. This permits us to 
get a better picture of the variation 
of the curve between maximum and 
minimum. For x = 3 we have y = 
0.25. We computed this value so as to 
get an idea of the rate of climb of the 
right branch of the curve. Similarly, 
to get an idea of the rate of fall of the 
left-hand branch of the curve we take
$x = -2$ and get $y = 1.5$. Using these
values, we obtain the curve shown in 
Figure 7.12.3. 

With this graph we can draw more accurate conclusions concerning the roots: one root lies between $x = 0.5$ and $x = 1$, the second between $x = 2$ and $x = 3$ (closer to 3), and the third is less than $x = -2$ (its value is most likely close to $x = -2.5$).
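Once the graph has located each root between two values of x where the function has opposite signs, the estimate can be refined mechanically. The sketch below uses bisection, which is a swapped-in numerical technique and not a method described in the text; the brackets are chosen from the values computed above, with $-2.5$ taken as an assumed left end for the negative root.

```python
def f(x):
    # The example polynomial, with coefficients consistent with (7.12.3).
    return 0.5 * x**3 - 0.75 * x**2 - 3.0 * x + 2.5

def bisect(func, lo, hi, tol=1e-10):
    """Halve [lo, hi] repeatedly, keeping the half where func changes sign."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("func must have opposite signs at lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Intervals read off the sketch: f changes sign on each of them.
brackets = [(-2.5, -2.0), (0.5, 1.0), (2.0, 3.0)]
roots = [bisect(f, lo, hi) for lo, hi in brackets]
print(roots)
```

Each pass halves the interval, so about 33 passes shrink an interval of length 1 to the tolerance used here.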

It may happen that after equating 
the derivative to zero we will not 
obtain any real roots. This will mean 
that the polynomial does not have a 
maximum or a minimum. Since all 
that has been said about the behavior 
of the polynomial for very large abso- 
lute values of x remains valid, the graph 
will intersect the x axis only at one point (the polynomial has one real root; see Figure 7.12.1).

Finally, the derivative of the polynomial may have only one (double) root $x_0$. Then the derivative will be of the form

$$y' = A(x - x_0)^2, \qquad (7.12.4)$$

whence, integrating, we get

$$y = \frac{A}{3}(x - x_0)^3 + C. \qquad (7.12.5)$$

From (7.12.5) we see that in this case the polynomial differs from a perfect cube only in a constant summand. It is clear that y has neither a maximum nor a minimum (see Example 1a in Section 7.1). The graph of this function intersects the x axis at one point, which can be found by equating y to zero:

$$\frac{A}{3}(x - x_0)^3 + C = 0,$$

that is,

$$(x - x_0)^3 = -\frac{3C}{A}, \qquad x = x_0 - \sqrt[3]{\frac{3C}{A}}. \qquad (7.12.6)$$

Finding the maximum and minimum 
of a third-degree polynomial (7.12.1) 
and, hence, the investigation of its 
graph can always be completed because 
by equating the derivative to zero we 
get a quadratic equation whose roots 
are not hard to find. (In general, how- 
ever, the situation is not all that 
simple.) A polynomial (of any degree) is 
preferable over all other functions in 
that we can always find its value for 
any value of x and its graph has no 
discontinuities or corners; when x in- 
creases without bound in absolute value, 
the absolute value of y also increases 
without bound (and the higher the 
degree of the polynomial, the greater 
the rate of this growth). But for an 
arbitrary function (even if it is algebra- 
ic) the situation may be quite differ- 
ent. 

In the general case, the construction 
of a graph requires studying the behav- 
ior of the function at infinity, that is, 
the behavior of the function as $x \to -\infty$
and $x \to \infty$. If, say, the function comes
closer and closer (in value) to a fixed 
number C as x increases without bound, 
the function has a horizontal asymptote 
y = C, that is, as x grows, the graph 
comes closer and closer to the straight 
line y = C. Similarly, if at x = a the 
function undergoes a discontinuity 
and, say, grows without bound as x 
approaches a, the straight line x = a 
is a vertical asymptote of the graph of 
the function $y = f(x)$, that is, the graph
moves closer and closer to the straight 
line x = a as x approaches a. It is also 
clear that the construction of a graph 
must begin by determining the domain 
of the function (the range of the independent
variable); in addition, we must
determine whether or not the function 
is even or odd (it may be neither) (see 
Section 1.7), since in the case, say, of 
an even function we need only construct 
the branch of the graph corresponding 
to positive values of x and then apply 
symmetry considerations. 

Let us illustrate what we have just said by several simple examples. Take the function

$$y = \frac{x - 1}{x + 1}. \qquad (7.12.7)$$

It is clear that this function is defined for all values of x except $x = -1$, where the denominator vanishes and the function undergoes a discontinuity. Since as x approaches $-1$ the absolute values of y increase without bound, the straight line l whose equation is $x = -1$ is the vertical asymptote of the graph of the function. It is also easy to see that near $x = -1$ the value of y is positive to the left of l and negative to the right of l (Figure 7.12.4).

Further details of the behavior of the function are related to the fact that the values of

$$y = 1 - \frac{2}{x + 1} \qquad (7.12.8)$$

for large absolute values of x will get as close to unity as desired (since the fraction $2/(x + 1)$ becomes very small in this case). Therefore, the straight line $y = 1$ serves as the horizontal asymptote of the graph of the function. Finally, by virtue of (7.12.8),

$$y' = \frac{2}{(x + 1)^2}, \qquad y'' = -\frac{4}{(x + 1)^3},$$

that is, $y'$ is positive for all values of x from the domain of the function, while $y''$ is negative for $x > -1$ and positive for $x < -1$. Thus, for all values of x (except $x = -1$, for which the function is not defined) our function increases; for $x > -1$ the graph is convex upward, while for $x < -1$ it is convex downward; the graph has no points of inflection (i.e. points where $y''$ vanishes). It is also important that $y = 0$ only at $x = 1$, which means that the graph intersects the x axis at point (1, 0), while at $x = 0$ we have $y = -1$, which means that the graph intersects the y axis at point (0, −1) (Figure 7.12.4).
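The statements about this function are easy to spot-check numerically. The sketch below assumes the function is $y = (x - 1)/(x + 1) = 1 - 2/(x + 1)$ and compares the derivative formulas quoted above with central finite differences at a few sample points; the helper `central` and the sample points are choices made for this illustration.

```python
def y(x):
    return (x - 1.0) / (x + 1.0)

def dy(x):
    return 2.0 / (x + 1.0) ** 2      # first derivative from (7.12.8)

def d2y(x):
    return -4.0 / (x + 1.0) ** 3     # second derivative from (7.12.8)

def central(func, x, h=1e-5):
    """Central-difference approximation to the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

samples = [-3.0, -2.0, 0.0, 1.0, 5.0]   # points away from the asymptote x = -1
max_err_first = max(abs(central(y, x) - dy(x)) for x in samples)
max_err_second = max(abs(central(dy, x) - d2y(x)) for x in samples)

print(max_err_first, max_err_second)   # both should be tiny
print(y(-1.001), y(-0.999))            # large positive, large negative
print(y(1e6), y(-1e6))                 # both close to the asymptote y = 1
```

A finite-difference check of this kind does not prove the formulas, but it catches sign and exponent slips quickly.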

Here are two somewhat more difficult examples. If

$$y = \frac{x^2 - 2}{x^2 - 1} \left(= 1 - \frac{1}{x^2 - 1}\right), \qquad (7.12.9)$$

the investigation is simplified by the fact that the function is even: the graph is symmetric about the y axis. Our function is defined for all values of x except $x = 1$ and $x = -1$; at these points the function becomes infinite. Thus, the straight lines $x = 1$ and $x = -1$ are the vertical asymptotes of the graph of the function. On the other hand, the straight line $y = 1$ is the horizontal asymptote (see the expression within the parentheses in




(7.12.9)). It is also clear that the equation $y(x) = 0$ has two roots: $x = \pm\sqrt{2}$, that is, $x \approx 1.41$ and $x \approx -1.41$.

If we employ the second form of (7.12.9), we readily see that

$$y' = \frac{2x}{(x^2 - 1)^2}, \qquad y'' = \frac{2}{(x^2 - 1)^2} - \frac{2 \cdot 2x \cdot 2x}{(x^2 - 1)^3} = -\frac{6x^2 + 2}{(x^2 - 1)^3}.$$

Thus, the derivative $y'$ vanishes only at $x = 0$; since $y''(0) = 2 > 0$, the value $x = 0$ corresponds to a (local) minimum of the function, $y_{\min} = y(0) = 2$. For negative x the function decreases and for positive x it increases; it has no points of inflection because $y''(x)$ is never zero; at $x^2 < 1$, that is, for $-1 < x < 1$, the second derivative is positive, or the function is convex downward, while for $|x| > 1$ the function is convex upward. The graph of the function is shown in Figure 7.12.5.

Finally, we take the function

$$y = \sqrt{\frac{x^2 - 2}{x^2 - 1}} = \sqrt{1 - \frac{1}{x^2 - 1}}, \qquad (7.12.10)$$

whose domain coincides with the domain of positivity of function (7.12.9), that is, the function is defined only for $|x| < 1$ and $|x| > \sqrt{2} \approx 1.4$. Since the values of this function can be found by extracting the square root of the values corresponding to the function depicted in Figure 7.12.5, we can immediately construct the graph (Figure 7.12.6). A fuller idea of the behavior of the curve can be obtained by calculating several values of $y(x)$ corresponding to certain points on the curve or the values of $y'(x)$, which give the slopes of the tangents to the curve at various points $(x, y(x))$ (for instance, for the curves in Figures 7.12.5 and 7.12.6 it is advisable to find the slope $y'(x)$ of the tangent at point (1.41, 0)).

Of course, the details with which a 
specific curve is built depend on the 
reasons that prompted us to investigate 
the curve in the first place; in some 
cases it is quite sufficient to obtain a 
rough qualitative picture of this behav- 
ior, while in others we may wish to 
obtain a fuller idea of the behavior. 

What we have discussed here in 
connection with curve sketching per- 
tains, of course, to “school” methods of 
constructing curves that are specified 
by equations. Quite different possibil- 
ities have been opened up by the use 
of electronic computers. It is possible 
to program a computer in such a way 
that it plots the curve of a function 
on the monitor screen when you key 
in the various values of the function 
and those of the independent variable.

Exercises


7.12.1. Find the maxima and minima of the following functions and plot the graphs of these functions: (a) $y = x^3 - 3x^2 + 2$, (b) $y = x^3 - 3x^2 + 3x - 15$, and (c) $y = x^3 - 3x^2 + 6x + 3$.

7.12.2. Determine the number of real roots of the following equations: (a) $2x^3 - 3x^2 - 12x + 15 = 0$, (b) $4x^3 + 15x^2 - 18x - 2 = 0$, (c) $2x^3 - x^2 - 4x + 3 = 0$, and (d) $x^3 - x^2 + 2 = 0$.

7.12.3. Construct the graphs of the fol- 
lowing functions: 

(a) y = ■ - J z _ f t (b) $y^2 = x^3(1 - x)$, (c) $y^2 = x^4(1 + x)$, and (d) y* = $x^2(1 - x)^2$.

Chapter 8 Radioactive Decay and Nuclear Fission


8.1 The Basic Characteristics 
of Radioactive Decay 

The basic law of radioactive decay , 
which was established in experiments, 
states that the ratio of the number of 
atoms disintegrated in unit time to the 
total number of atoms is a constant 
that depends only on the species of 
atom. It is understood that the total 
number of atoms is extremely large. 

This ratio is called the probability of disintegration. Denote by N(t) the quantity of atoms that have not disintegrated by time t. At time t + dt there will be N(t + dt) untransformed atoms. For this reason, during time dt (from t to t + dt) there will be $N(t) - N(t + dt) \approx -dN$ disintegrations of atoms. If we divide the ratio of the number $-dN$ of atoms that have disintegrated during the time interval dt to the total number of atoms N (this ratio is the fraction of disintegrated atoms) by dt, we get the probability of disintegration $\omega = -dN/(N\,dt)$, from which it follows that

$$\frac{dN}{dt} = -\omega N. \qquad (8.1.1)$$

From this relation, recalling that the dimensions of $dN/dt$ are the same as those of the ratio $N/t$, we see that the dimensions of the probability of disintegration $\omega$ are 1/s.^{8.1}

The initial condition consists in specifying the number of atoms at the initial time: $N = N_0$ at $t = t_0$ (in what follows we take $t_0 = 0$).

Solving Eq. (8.1.1) by the method given in Section 6.6 (see Eqs. (6.6.9) and (6.6.10)) and using the initial condition, we find that

$$N(t) = N_0 e^{-\omega t} \qquad (8.1.2)$$

(we advise the reader to perform all the 
computations). However, when the de- 
rivative is proportional to the desired 
function, a simpler solution to the 
equation can be offered. 

8.1 Consequently, probability here is not to be understood in the sense of the assertion that, as in coin tossing, the probability is one half that the coin will fall heads. The definition of the probability of disintegration as the ratio of the number of disintegrations per unit time to the initial number of atoms holds true only for the case where the number of disintegrations per unit time (say, per second) constitutes a small fraction of the total number of atoms. The exact definition of probability of disintegration is given by the formula $\omega = -(1/N)\,dN/dt$, that is, the probability of disintegration is equal, by definition, to the ratio of the number of disintegrations during a small time interval to the total number of atoms and to the magnitude of the time interval.

In Chapters 4 and 6 we discovered that the derivative of the exponential function is proportional to the function itself:

$$\frac{d(a^x)}{dx} = \text{const} \times a^x.$$

In particular,

$$\frac{d(Ce^{kx})}{dx} = kCe^{kx}$$

if C and k are constants. Recalling this property of the exponential function, let us suppose that the solution to Eq. (8.1.1) is of the form

$$N = Ce^{kt}, \qquad (8.1.3)$$

and let us try to choose C and k such that both the equation and the initial condition are satisfied. Differentiating (8.1.3) we get

$$\frac{dN}{dt} = Cke^{kt} = kN.$$

Substituting into Eq. (8.1.1) yields $kN = -\omega N$, whence $k = -\omega$. Assuming $t = 0$ in (8.1.3) and using the initial condition, we get $C = N_0$. And so $N = N_0 e^{-\omega t}$, where the quantity $-\omega t$ in the exponent is dimensionless, as it should be.

Radioactive atoms are characterized by their half-life T, which is the time during which the number of atoms N diminishes via disintegration by one half the original amount. Let us determine the half-life T. From (8.1.2), $N(T) = N_0 e^{-\omega T}$. On the other hand, $N(T) = N_0/2$, by definition. Therefore $N_0 e^{-\omega T} = N_0/2$, or $e^{-\omega T} = 1/2$, that is,

$$-\omega T = -\ln 2, \qquad T = \frac{\ln 2}{\omega} \approx \frac{0.69}{\omega}. \qquad (8.1.4)$$

Hence, the half-life is inversely pro- 
portional to the probability of disin- 
tegration. 
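The decay law and formula (8.1.4) for the half-life translate directly into code. The sketch below is an illustration only; the function names and the numerical value of $\omega$ are hypothetical and do not refer to any particular substance.

```python
import math

def remaining(n0, omega, t):
    """Number of undecayed atoms at time t, N = N0 * exp(-omega * t)."""
    return n0 * math.exp(-omega * t)

def half_life(omega):
    """Half-life T = ln 2 / omega from (8.1.4)."""
    return math.log(2.0) / omega

omega = 0.1            # hypothetical probability of disintegration, in 1/s
n0 = 1000.0            # hypothetical initial number of atoms
T = half_life(omega)

print(T)                              # about 0.69 / omega
print(remaining(n0, omega, T))        # half of n0
print(remaining(n0, omega, 2 * T))    # a quarter of n0
```

Doubling $\omega$ halves T, which is the inverse proportionality stated above.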

Prior to disintegration, every atom 
exists a certain period of time, which 
is called the lifetime of the atom. 

We wish to find the mean lifetime $\bar{t}$ of an atom of a given radioactive element. Suppose that at the initial time $t = 0$, say when the radioactive element was created, there were $N_0$ atoms. During the time interval from t to t + dt the quantity of atoms that disintegrated was approximately

$$-dN = \omega N\,dt.$$


All the atoms of this group lived rough- 
ly the same lifetime t. Among the 
atoms taken at the initial time t = 0 
there are groups of atoms that will have 
different lifetimes: from the common- 
to-all-atoms time of creation to the 
distinct-for-various-atoms time of dis- 
integration. To find the mean lifetime, 
we must multiply the lifetime of each 
group by the number of atoms in the 
group, add these quantities for all 
groups, and divide the result by the 
total number of atoms in all groups. 

Since we have to add a very large number of very small terms, the sum can be replaced by an integral, and therefore (compare with Section 7.8)

$$\bar{t} = \frac{\int_0^\infty t\,\omega N\,dt}{\int_0^\infty \omega N\,dt}. \qquad (8.1.5)$$



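We substitute the expression for N from (8.1.2). The denominator on the right-hand side of (8.1.5) is

$$\int_0^\infty \omega N\,dt = \int_0^\infty \omega N_0 e^{-\omega t}\,dt = \omega N_0 \int_0^\infty e^{-\omega t}\,dt = -\omega N_0\,\frac{1}{\omega}\,e^{-\omega t}\Big|_0^\infty = N_0,$$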


as was to be expected, since the integral 
in the denominator yields the total 
number of all disintegrated atoms, 
which, clearly, is equal to the number 
of atoms existing at the initial time. 

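We integrate the integral in the numerator on the right-hand side of (8.1.5) by parts, setting $t = f$, $e^{-\omega t}\,dt = dg$, and $-(1/\omega)e^{-\omega t} = g$. As a result we have

$$\omega N_0 \int_0^\infty t e^{-\omega t}\,dt = \omega N_0 \left(-\frac{1}{\omega}\,t e^{-\omega t}\Big|_0^\infty + \frac{1}{\omega}\int_0^\infty e^{-\omega t}\,dt\right) = \frac{N_0}{\omega}.$$

From (8.1.5) we now obtain

$$\bar{t} = \frac{N_0}{\omega N_0} = \frac{1}{\omega}, \qquad (8.1.6)$$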
that is, the mean lifetime of an atom is exactly the inverse of the disintegration probability. Using this fact, we can write the basic equation (8.1.1) and its solution (8.1.2) as

$$\frac{dN}{dt} = -\frac{N}{\bar{t}}, \qquad (8.1.7)$$

$$N = N_0 e^{-t/\bar{t}}. \qquad (8.1.8)$$

We must not forget that the time t is the independent variable; the number of atoms depends on t. On the other hand, $\bar{t}$ is a constant that describes the given species of radioactive atom.

From (8.1.8) it is evident that during time $\bar{t}$ the number of atoms diminishes from $N_0$ to $N_0 e^{-1} = N_0/e$, by a factor of e, which is roughly equal to 2.72.

By formula (8.1.7), the initial rate of disintegration is such that if the number of atoms decaying per unit time did not fall off, all the atoms would disintegrate in time $\bar{t}$. Indeed, at $t = 0$ there were $N_0$ atoms and the rate of disintegration was $(dN/dt)_{t=0} = -N_0/\bar{t}$. At that rate, complete disintegration requires a time equal to $\bar{t}$. From (8.1.4) it follows that $\omega = (\ln 2)/T$, and so $\bar{t} = T/\ln 2 \approx 1.44\,T$. Computationally, the quantity $\bar{t}$ is more convenient than the half-life T.

Exercises

8.1.1. The mean lifetime of radium is 2400 years. Determine the half-life of radium.

8.1.2. We start with 200 grams of radium. How much radium will be left in 300 years?

8.1.3. Ten grams of radium disintegrated in 500 years. How much was there at the beginning?

8.1.4. Determine how much time will elapse for 1%, 10%, 90%, and 99% of an original supply of radium to disintegrate.

8.1.5. The amount of radium in the earth in various rocks (the ratio of the number of radium atoms to the number of atoms of rock) comes out to about one part in $10^{12}$. What was the content of radium in the rocks 10 000 years ago, $10^6$ years ago, $5 \times 10^9$ years ago ($5 \times 10^9$ years is the age of the earth)?


8.2 Measuring the Mean Lifetime
of Radioactive Atoms

The mean lifetime $\bar{t}$ of various radioactive atoms varies over an extremely wide range. To illustrate, let us take the several known isotopes of uranium. One, with atomic weight 238 ($U^{238}$), has a mean lifetime of $7 \times 10^9$ years. Another ($U^{235}$) has a mean lifetime of $10^9$ years (the fission of uranium-235 in nuclear power plants is the main source of atomic energy). The mean lifetime of radium is 2400 years.^{8.2}

However, it would be wrong to think 
that the mean lifetimes of all radio- 
active atoms are measured in thou- 
sands of years. Among radioactive sub- 
stances that occur in nature and were 
studied by Marie and Pierre Curie and 
Ernest Rutherford we find polonium 
with a mean lifetime of about 200 days, 
radium A with a mean lifetime of 4 minutes,
and radium C' with a mean lifetime
of $2 \times 10^{-4}$ second.

During recent decades, a huge 
number (over 1000) of different radio- 
active substances with a vast range of 
mean lifetimes have been discovered 
in connection with the development 
of nuclear physics and the use of atomic 
energy. 

If at time t there are N(t) untransformed atoms, then there will be $n(t) = \omega N(t) = -dN/dt$ disintegrations (of atoms) per unit time. The quantity $n(t)$ ($= n_0 e^{-\omega t}$, with $n_0 = -(dN/dt)_{t=t_0}$; compare with (8.1.2)) is the rate of disintegration of the atoms (see Section 8.1).

8.2 Reference books frequently give the half-life $T \approx 0.69\,\bar{t}$ instead of the mean lifetime $\bar{t}$; see Section 8.1.

Suppose we observe the decay of uranium for ten years. During this time about $4 \times 10^{12}$ atoms in one gram of uranium will have disintegrated. It would be extremely hard to detect that the original amount of $2.5 \times 10^{21}$ atoms would be diminished by $4 \times 10^{12}$ atoms. Measuring the number of disintegrations often proves to be easier than measuring the number of atoms that have not disintegrated.
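The order of magnitude quoted for uranium can be reproduced from the decay law. The sketch below takes the figures from the text ($2.5 \times 10^{21}$ atoms per gram, mean lifetime $7 \times 10^9$ years, ten years of observation); the variable names are hypothetical. Because $t/\bar{t}$ is tiny, the code uses `math.expm1`, which evaluates $e^x - 1$ without the loss of precision that a direct `1 - math.exp(x)` would suffer.

```python
import math

n0 = 2.5e21          # atoms in one gram of uranium (figure from the text)
mean_lifetime = 7e9  # mean lifetime of U-238 in years (figure from the text)
t = 10.0             # observation time in years

# Atoms decayed: N0 * (1 - exp(-t / mean_lifetime)), computed stably.
decayed = -n0 * math.expm1(-t / mean_lifetime)
relative_change = decayed / n0

print(decayed)          # a few times 10**12 atoms
print(relative_change)  # about 1.4e-9, far too small to detect by weighing
```

The relative change of roughly one part in a billion is what makes counting individual disintegrations the practical measurement.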

However, by experiments involving radioactive substances with relatively short mean lifetimes (from a few minutes to a few days) it has been possible to verify formula $n = n_0 e^{-\omega t}$ (which follows from the fact that n is proportional to N) with great accuracy and, thus, to corroborate formulas (8.1.1) and (8.1.2). To do this, let us calculate the number of disintegrations over small periods of time. Dividing the number of disintegrations by the time interval, we get the disintegration rate at various instants of time.

We construct the graph of the disin- 
tegration rate as a function of time. 
How can we be sure that this curve 
is the graph of the exponential func- 
tion? We can compute the logarithms 
of the resulting values of the rate of 
disintegration and, on this basis, con- 
struct a graph of In n ( t ) as a function 
of time t. The result should be a straight 
line, which can be checked roughly by 
eye. Numerous experiments do indeed 
yield a straight line. 

Thus, $\ln n(t)$ is a linear function of time, or

$$\ln n(t) = a + bt, \qquad (8.2.1)$$

which means that $n(t) = e^{a + bt} = e^a e^{bt} = ce^{bt}$. The quantity b turns out to be negative on the graph, $b = -\omega$, where $\omega$ is the probability of disintegration. Thus, experiments confirm the basic result of the preceding section and enable us to determine $\omega$ by computing the tangent of the angle of inclination of the straight line (8.2.1) to the time axis.
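Instead of checking the straight line by eye, the slope b in (8.2.1) can be estimated by a least-squares fit of $\ln n$ against t. The sketch below is an illustration under stated assumptions: the measurements are synthetic (generated from a hypothetical $\omega = 0.25$ and $n_0 = 1000$ with a seeded 2% random scatter), not experimental data, and `fit_decay_constant` is a hypothetical helper name.

```python
import math
import random

def fit_decay_constant(times, rates):
    """Least-squares line through (t, ln n); returns (a, b) of ln n = a + b t."""
    logs = [math.log(n) for n in rates]
    t_mean = sum(times) / len(times)
    l_mean = sum(logs) / len(logs)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    b = sxy / sxx
    a = l_mean - b * t_mean
    return a, b

# Synthetic "measurements" (assumed values, for illustration only).
rng = random.Random(0)
true_omega, n0 = 0.25, 1000.0
times = [float(k) for k in range(20)]
rates = [n0 * math.exp(-true_omega * t) * (1.0 + rng.uniform(-0.02, 0.02))
         for t in times]

a, b = fit_decay_constant(times, rates)
omega_est = -b                        # b = -omega in (8.2.1)
mean_lifetime_est = 1.0 / omega_est   # mean lifetime is 1 / omega

print(omega_est, mean_lifetime_est, math.exp(a))
```

The fitted slope should come out close to the value used to generate the data, and $e^a$ close to the initial rate.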

Actually, this is a very remarkable result.^{8.3} Imagine $N_0$ radioactive atoms “manufactured” simultaneously at time $t = 0$. They are all prepared in the same fashion and at the same time. We know that radioactive atoms are unstable and are capable of disintegrating. We can suppose that the disintegration of the atoms requires a definite time. Imagine that after the atoms are ready, a certain time must elapse before they are mature enough to disintegrate. But then we should expect all the atoms to mature in the same period of time and, at the expiration of that time, to disintegrate simultaneously. Imagine, further, that we have models of guns with stretched springs and gears (or clockwork mechanisms) that release their shells when the gears (or clocks) reach a particular position (time). Firing will be regarded as disintegration of the model. If all models are the same and manufactured at the same time, the shells should be fired after the lapse of an identical time interval.

8.3 Niels Bohr, speaking on radioactive transformations, said in this connection: “The meaning of the discussions on the mean lifetimes of atoms without any indication of a definite instant of time lies in the fact that the atoms, so to say, do not grow old until they begin to decay; this means that the probability of disintegration is the same for them at any time.”

But this picture of model disintegration (firing of the shells) has nothing whatsoever in common with the actual behavior of radioactive atoms. Though created at the same time, they disintegrate at all imaginable times. Let us try to find out what percentage disintegrates during a time less than the mean lifetime. From (8.1.2) we find that the rate of disintegration (the number of atoms that decay in unit time) is $dN/dt = -\omega N_0 e^{-\omega t}$ (it is clear that dN is negative). During time dt there will be (in absolute value)

$$\frac{dN}{dt}\,dt = dN = -\omega N_0 e^{-\omega t}\,dt$$

atomic disintegrations, while during the time from $t = 0$ to $t = \bar{t}$ the following number of atoms will disintegrate:

$$M = \int_0^{\bar{t}} \omega N_0 e^{-\omega t}\,dt = -N_0 e^{-\omega t}\Big|_0^{\bar{t}} = N_0\left(1 - e^{-\omega\bar{t}}\right) \text{ atoms.}$$

Since $\omega = 1/\bar{t}$, it follows that

$$M = N_0\left(1 - \frac{1}{e}\right) \approx 0.63\,N_0.$$

Thus, 63% of the atoms will disintegrate during a time less than $\bar{t}$. Similarly, we compute that during the time from $\bar{t}$ to $2\bar{t}$ 23% of the atoms disintegrate, and during the time exceeding $2\bar{t}$ only 14% of the atoms (the remainder).
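The three percentages follow from the same integral taken over different time intervals. A minimal sketch, measuring time in units of the mean lifetime (so that the value `tau = 1.0` is merely a convenient assumption), is:

```python
import math

def fraction_decayed(t1, t2, tau):
    """Fraction of the initial atoms that decay between times t1 and t2.

    Equals exp(-t1 / tau) - exp(-t2 / tau), where tau is the mean lifetime.
    """
    return math.exp(-t1 / tau) - math.exp(-t2 / tau)

tau = 1.0  # time measured in units of the mean lifetime
first = fraction_decayed(0.0, tau, tau)           # before one mean lifetime
second = fraction_decayed(tau, 2.0 * tau, tau)    # between one and two
rest = math.exp(-2.0)                             # after two mean lifetimes

print(round(first, 2), round(second, 2), round(rest, 2))
print(first + second + rest)  # the three fractions add up to 1
```

The fractions depend only on the ratio of the times to the mean lifetime, so the same percentages hold for every radioactive species.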

In Figure 8.2.1 we see two curves: 
one for the number of disintegrations 
in unit time for radioactive atoms (curve 
1) and for the gun models (curve 2). 
The gun-model curve has a certain 
width here. We can figure, perhaps, 
that the models were not quite exact 
and therefore did not fire off quite at 
the same time. The more precise the 
models, the narrower curve 2 in Fig- 
ure 8.2.1. 

The area under the curve represents 
the total number of disintegrated atoms 
for one curve and the total number of 
all models for the other curve. We can 
take the number of models to be equal to 
the number of atoms. Then both curves 
will have the same area. The abscissa 
of the center of gravity of both curves 
is also the same.^{8.4} This means that
we consider models in which the mean 
lifetime (prior to firing) is the same 
as the mean lifetime of the radioactive 
atoms. 

We have thus done everything in our power to make the curves similar: we chose the number of models and their mechanisms so that the total numbers of models and atoms and the mean lifetimes of the models and atoms are the same. And yet the curves are so strikingly different! Experimentation with radioactive nuclei irrefutably rejects the type of curve obtained for the models. The more accurate the experiment, the more precisely is the law (8.1.2) confirmed.

8.4 As will be demonstrated in Section 9.12, this follows from Eq. (8.1.5).

We examined this system with models 
so that the reader would not accept as 
ordinary and natural the relationship 
(8.1.2) for radioactive decay and would 
have cause for surprise and curiosity: 
“Indeed, why does radioactive disinteg- 
ration proceed in this fashion?” 

What is the physical meaning of the 
probability of disintegration? A long 
time ago, at the beginning of the cent- 
ury, it was suggested that radioactive 
decay requires some kind of external 
action, say the entry of a particle from 
outside. Then we would imagine that 
one atom disintegrated earlier since 
it was hit by an incoming particle, 
while some other atom remained un- 
touched. But this hypothesis did not 
fit the facts which stated that radio- 
active disintegration proceeds at the 
same rate under all manner of condi- 
tions, irrespective of temperature, colli- 
sions of atoms among themselves, the 
action of cosmic radiation. Also, energy 
is strictly conserved in radioactive 
disintegration, which likewise rejects 
the idea of some kind of outside influ- 
ence. 

A second possible hypothesis is that 
at the initial time of creation of the 
radioactive atoms they were actually 
not quite alike and for this reason de- 
cayed at different times. This is in 
keeping with the clock-driven models 
with clocks set for different times. This 
hypothesis presumes that an exact 
knowledge of the state of every atom 
completely determines the whole sub- 
sequent history of the atom and, in 
particular, determines with exactitude 
when the given atom will disintegrate. 
If atoms disintegrate at different times 
after their creation, this means that 
the whole business was foreordained:
when created, the different atoms of
one and the same radioactive substance 
were not exactly the same and the di- 
verse decay times were predetermined 
in the creation stage. 

This view does not hold water either. 
For each specific mode of generation 
of atoms of a radioactive element we 
should have a definite relationship 
between the rate of disintegration and 
time. Experiment refutes this supposi- 
tion. 

One and the same species of radioactive atom can often be
obtained in a variety of ways: say, atoms of Mo$^{99}$
(molybdenum with atomic weight 99) are produced in nuclear
reactors in the process of fission of uranium atoms. These same
atoms were earlier obtained under the action of the nuclei of
heavy hydrogen (deuterium) on the atoms of ordinary, naturally
occurring, nonradioactive molybdenum. Experiments have shown
that, irrespective of the mode of production of the atoms, the
disintegration rate is given by formula (8.1.2) with a constant
value of $\omega$, which characterizes the given species of
atom. Consequently, it is precisely the basic equation
$dN/dt = -\omega N$ that all experiments confirm, with the
characteristic $\omega$ of atoms of a given species quite
independent of external conditions.

This equation is pregnant with mean- 
ing: all radioactive atoms of a single 
species are identical. The probability 
of disintegration does not depend on 
how and when the atoms were obtained. 
One hundred freshly produced atoms disintegrate in exactly the
same fashion as in the case where $10^6$ atoms are generated, a
time interval elapses such that 100 atoms are left, and we
consider the fate of these 100 remaining atoms.[8.5]


[8.5] Characteristic of the exponential function is that any
portion of the curve is similar to the whole curve. Indeed, let
us begin reckoning time anew from time $t_1$. Denote by $\tau$
the time reckoned from this instant: $\tau = t - t_1$,
$t = t_1 + \tau$. Then
$N = N_0 e^{-\omega t} = N_0 e^{-\omega (t_1 + \tau)} = N_0 e^{-\omega t_1} e^{-\omega \tau} = N_1 e^{-\omega \tau}$,
where $N_1 = N_0 e^{-\omega t_1}$ is the number of atoms at time
$t_1$.


What is so remarkable in the fact 
that 100 atoms with a given atomic 
weight and a given number of electrons 
are the same? If these were nonradio- 
active atoms, there would indeed be 
no cause for surprise. But for radio- 
active atoms there certainly is cause 
enough when we recall that out of the
100 atoms about 63 disintegrate within the mean lifetime
$\bar{t} = 1/\omega$ and the other 37 disintegrate after $\bar{t}$.
What is strange here is that the disinte- 
gration time is different although the 
atoms are the same. 
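
The figures 63 and 37 follow from the exponential law if the time in question is taken to be the mean lifetime $1/\omega$, since $1 - e^{-1} \approx 0.63$. The short sketch below checks this and also the property that atoms which have already survived a time $t_1$ decay exactly like fresh ones. The value of `omega` and the times `t1` and `tau` are arbitrary illustrative choices, not data from the text.

```python
import math

def surviving_fraction(omega: float, t: float) -> float:
    """Fraction of atoms not yet decayed after time t, from N = N0 * exp(-omega * t)."""
    return math.exp(-omega * t)

omega = 0.25                 # hypothetical decay probability per unit time
mean_life = 1.0 / omega      # mean lifetime

# Share of atoms that decay within one mean lifetime: 1 - 1/e.
decayed = 1.0 - surviving_fraction(omega, mean_life)
print(f"decayed within one mean lifetime: {decayed:.3f}")

# "No memory": survival over a further time tau does not depend on
# how long the atoms have already existed.
t1, tau = 7.0, 3.0           # hypothetical times
conditional = surviving_fraction(omega, t1 + tau) / surviving_fraction(omega, t1)
fresh = surviving_fraction(omega, tau)
print(f"survivors of t1, further tau: {conditional:.6f}; fresh atoms: {fresh:.6f}")
```

The two printed survival fractions coincide for any choice of `t1`, which is just the statement of footnote 8.5 in numerical form.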

It is not fruitless to wonder in this 
fashion. In the phenomenon of radio- 
active decay we already perceive cer- 
tain peculiarities in the laws of motion 
of atomic and nuclear particles that 
differ from the laws of motion of the 
bodies we are accustomed to in classical 
mechanics and ordinary life. These 
peculiarities are studied in quantum 
mechanics. All this is of course outside 
the scope of our book. Our aim is modest 
enough. It is to show that the necessity 
for elaborating radically new concep- 
tions that differ drastically from those 
of ordinary mechanics stems from the 
very simple facts about radioactivity 
that can be comprehended by any 
school child. To realize that the old 
conceptions were insufficient, it was 
necessary to doubt, to wonder, to be 
surprised. 

In his autobiography, Albert Ein- 
stein — the greatest physicist of the 
twentieth century— notes the surprise 
and wonder that he experienced when 
he first saw a compass and perceived 
the mysterious action of a magnetic 
force that passes through paper, wood, 
the earth and acts on the compass needle 
without any direct contact. He wrote 
that this wonderment served as a tre- 
mendous impetus for a further search. 
He wrote of curiosity, which he claims “the modern methods of
teaching have all but stifled.” Einstein himself evinced an
extraordinary capability for wonderment and he was able to
derive inspiration and impetus for the creation of theories out
of the most mundane facts of everyday life. For instance,
underlying the brilliant general theory of relativity is the
simple fact that caused Einstein to wonder why different bodies
fall with the same acceleration.

[8.5, continued] Thus, the law of disintegration for the number
of particles $N_1 = N_0 e^{-\omega t_1}$ remaining after the
previous decay is exactly the same as the law of disintegration
of freshly obtained particles. This is precisely what is
asserted in the text.

Quite naturally, to wonder is not 
enough, and to merely pose a problem 
does not suffice. Einstein combined 
the ability to pose a problem and to 
solve it, which means mastering the 
requisite mathematical techniques. And 
yet, among a galaxy of outstanding 
scientists, Einstein is the celebrated 
physicist of the twentieth century be- 
cause of his capacity to wonder and pose 
a problem where others were not able 
to see anything out of the ordinary. 

Perhaps this analysis of radioactive 
decay will help the reader to see what 
depths of content are hidden behind 
simple facts and formulas. 

To conclude this section, we will 
illustrate the curves of radioactive 
decay obtained experimentally in 1955 
by Glenn Seaborg and his associates 
in the United States (Figure 8.2.2). 
They were first to observe Element 
No. 101 of the periodic table, to which 
they gave the name mendelevium (sym- 
bol Md) in honor of the great Russian 
chemist Dmitri Mendeleyev. 



Minutes after bombardment 

Figure 8.2.2 


In this study the 98th element, californium, with atomic weight
252 was irradiated with neutrons. A neutron is captured by the
nucleus of a Cf$^{252}$ atom, which transforms into a Cf$^{253}$
atom. Californium-253 ejects an electron and turns into Element
No. 99, which is called einsteinium (symbol Es) and has the same
atomic weight 253.

About $10^9$ atoms of einsteinium (this is equivalent to
$4 \times 10^{-13}$ gram) were deposited on a gold plate and
subjected to alpha-particle bombardment ($\alpha$ particles are
nuclei of helium) in a cyclotron. This generates Element
No. 100, fermium (symbol Fm), in accordance with the reaction

${}_{99}\mathrm{Es}^{253} + {}_{2}\mathrm{He}^{4} = {}_{100}\mathrm{Fm}^{256} + {}_{1}p^{1},$

and mendelevium via the reaction

${}_{99}\mathrm{Es}^{253} + {}_{2}\mathrm{He}^{4} = {}_{101}\mathrm{Md}^{256} + {}_{0}n^{1}.$

In this notation, each chemical symbol 
has a subscript indicating the number 
in the periodic table (that is, the number 
of protons in the nucleus). The super- 
script indicates the atomic weight 
(rounded to a whole number, which is the 
number of neutrons and protons in the 
nucleus). The helium nucleus, or the $\alpha$ particle, is
denoted by ${}_{2}\mathrm{He}^{4}$, the nucleus of a hydrogen
atom, or simply the proton, by ${}_{1}p^{1}$, and the neutron by
${}_{0}n^{1}$. In a nuclear reaction the sum of
the subscripts on the left-hand side of 
the reaction equals that on the right- 
hand side, and the same is true of the 
superscripts, since in nuclear reactions 
all we have is an exchange of neutrons 
and protons between the nuclei. 
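
The bookkeeping rule just stated (subscripts balance, superscripts balance) is easy to check mechanically. The sketch below does so for the two reactions written above. The `Nuclide` type and the `is_balanced` helper are names introduced here purely for illustration.

```python
from typing import NamedTuple, Sequence

class Nuclide(NamedTuple):
    symbol: str
    z: int   # subscript: number of protons (place in the periodic table)
    a: int   # superscript: number of protons plus neutrons

def is_balanced(left: Sequence[Nuclide], right: Sequence[Nuclide]) -> bool:
    """True if both the subscripts and the superscripts sum to the same value."""
    same_charge = sum(n.z for n in left) == sum(n.z for n in right)
    same_mass = sum(n.a for n in left) == sum(n.a for n in right)
    return same_charge and same_mass

es253 = Nuclide("Es", 99, 253)
he4 = Nuclide("He", 2, 4)
fm256 = Nuclide("Fm", 100, 256)
md256 = Nuclide("Md", 101, 256)
proton = Nuclide("p", 1, 1)
neutron = Nuclide("n", 0, 1)

print(is_balanced([es253, he4], [fm256, proton]))    # fermium reaction
print(is_balanced([es253, he4], [md256, neutron]))   # mendelevium reaction
print(is_balanced([es253, he4], [md256, proton]))    # deliberately wrong pairing
```

The third call pairs mendelevium with a proton instead of a neutron, so the subscripts no longer balance and the check fails, as it should.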

After the $\alpha$-particle bombardment,
the gold plate together with the newly 
formed fermium and mendelevium was 
dissolved in acid, and then the fermi- 
um and mendelevium were extracted 
chemically. As Seaborg writes, it was 
the periodic table of Mendeleyev that 
permitted foreseeing the chemical pro- 
perties of an element that had never 
before existed in nature and had never 
been studied. After chemical separation, measurements were made
of the radioactive decay characteristics. Fermium-256 (with
atomic weight of 256)
disintegrates radioactively with a 
half-life of about 3.5 hours. It breaks 
up (that is, fissions) spontaneously 
into two nuclear fragments of roughly 
the same mass (the details of the fission 
process are given in Section 8.3). 

The upper curve in Figure 8.2.2 
shows the number of nuclei of fermium 
as a function of time in this experi- 
ment. Plotted on the horizontal axis 
is the time in minutes. Plotted on the 
vertical axis is the number of atoms available
at a given time.[8.6] The scale on
this axis is not uniform: the distance 
from the horizontal axis is proportio- 
nal to the logarithm of the number of 
atoms. In particular, the horizontal axis ($y = 0$) corresponds
to one remaining atom ($\ln 1 = 0$), while for zero atoms we
have $-\infty$ on the vertical axis. The disintegration of each
separate atom changes the number of 
atoms by 1; in between disintegrations 
the number of atoms is constant. In 
the experimental set-up in which each 
separate disintegration is recorded, we 
have a polygonal (step-like) curve in- 
stead of a smooth curve. On the step- 
like curve, each disintegration is asso- 
ciated with a vertical line connecting 
two steps. The upper straight line in Figure 8.2.2 corresponds
to the decay law $n = n_0 e^{-t/\bar{t}}$, where
$\bar{t} = T/\ln 2 \approx 5$ hours and $T \approx 3.5$ hours.
It will be seen from Figure 8.2.2 that, in all, there were
recorded 40 disintegrations of fermium. The more atoms there
are, the closer is the polygonal line to a straight line. When
there are fewer than five atoms left, then quite naturally the
probabilistic nature of radioactive decay leads to appreciable
deviations from the exponential law, which holds true for large
numbers of atoms.
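
To get a feeling for how a step-like curve of only 40 atoms scatters about the smooth exponential, one can imitate the experiment with random numbers. The following is a minimal sketch, not a reconstruction of the actual measurement: it draws 40 independent decay times from the exponential distribution with the half-life of 3.5 hours quoted above. The seed and the sampling instants are arbitrary choices made here.

```python
import math
import random

HALF_LIFE_H = 3.5                          # half-life T quoted in the text, hours
MEAN_LIFE_H = HALF_LIFE_H / math.log(2)    # mean lifetime, about 5 hours

def simulate_decay_times(n_atoms: int, mean_life: float, seed: int = 0) -> list:
    """Draw one random decay time per atom; each atom decays independently."""
    rng = random.Random(seed)
    return sorted(rng.expovariate(1.0 / mean_life) for _ in range(n_atoms))

def atoms_remaining(decay_times: list, t: float) -> int:
    """Height of the step curve at time t: atoms whose decay lies after t."""
    return sum(1 for d in decay_times if d > t)

times = simulate_decay_times(40, MEAN_LIFE_H)
for t in (0.0, 3.5, 7.0, 10.5, 14.0):
    smooth = 40 * math.exp(-t / MEAN_LIFE_H)
    print(f"t = {t:5.1f} h  step curve: {atoms_remaining(times, t):2d}  "
          f"exponential law: {smooth:5.1f}")
```

Changing `seed` gives a different step curve each time; the scatter is relatively largest in the tail, where only a few atoms are left.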


[8.6] It is not possible to count the number
of atoms available at a given instant. What 
is recorded experimentally are the disintegra- 
tion events of the atoms. The number of atoms 
N at time t is computed after the exper- 
iment, when all N atoms have disintegrated. 


After chemical separation, the nu- 
cleus of mendelevium rapidly (in half 
an hour) captures an atomic electron 
and transforms into a nucleus of fer- 
mium. And so the precipitate contai- 
ning mendelevium also yields (when the 
radioactivity is measured) a disinte- 
gration of atoms into two fragments 
with a half-life of 3.5 hours. The decay curve for fermium
obtained from mendelevium lies in the lower left-hand corner of
Figure 8.2.2. Six disintegrations were experimentally observed.
Special experiments demonstrated that these six atoms could not
have appeared as a fermium impurity in the precipitate being
measured but most definitely had formed from the mendelevium.

Seaborg and his associates observed 
a total of 17 atoms of mendelevium in 
that series of experiments. 

The foregoing example is not very 
good as an illustration of how exactly 
the exponential law (8.1.2) holds true 
in radioactive decay (see also (8.2.1)). 
Experiments demonstrating the valid- 
ity of the exponential law were success- 
fully carried out with more common 
radioactive substances. On the other 
hand, the example of mendelevium and 
fermium shows what peaks of experi- 
mental technique modern physicists 
have attained in synthesizing new ele- 
ments and recording the disintegration 
of each separate atom. 

In Seaborg’s experiments, the 
counter recording mendelevium disinte- 
grations was hooked up to an amplifier 
in the loudspeaker system of the insti- 
tute and every disintegration event was 
heard by workers in the various labora- 
tories on different floors so they could 
celebrate the birth (actually the recor- 
ded death) of every atom of the new ele- 
ment created by man (true, before the 
work was over, the local fire department 
got interested in these goings-on and 
the disintegration news broadcast was 
stopped). 

In the Soviet Union the synthesis and investigation of the
heaviest elements are being successfully conducted
by Academician G. N. Flyorov and his
associates at the Joint Institute for 
Nuclear Research in Dubna, a city 
near Moscow. 

8.3 Series Disintegration 
(Radioactive Family) 

In a number of cases, radioactive decay produces atoms that
again decay radioactively, so what we have is a chain of
disintegrations: an atom of element A transforms into an atom of
element B, which in turn disintegrates into an atom of element
C, and so on. Let us consider the mathematical problem of
determining the dependence on time of the quantities of elements
A, B, C and ways of solving the problem. We denote the
quantities of substances (elements) A, B, C that have not yet
decayed by time $t$ by the italic letters $A$, $B$, $C$; thus,
$A = A(t)$, $B = B(t)$, and $C = C(t)$.

Let the probabilities of disintegration of A, B, C be equal to
$\omega$, $v$, $u$, respectively. Then

$\frac{dA}{dt} = -\omega A$   (8.3.1)

(here, A is termed the parent element). We write the equation
for element B (the daughter element). In unit time a total of
$vB$ atoms of element B disintegrate. On the other hand, during
the same time we have $\omega A$ disintegrations of element A,
and since each disintegration of an atom of A gives rise to one
atom of B, there are $\omega A$ atoms of B formed in unit time.
Therefore

$\frac{dB}{dt} = -vB + \omega A.$   (8.3.2)

Similar reasoning yields

$\frac{dC}{dt} = -uC + vB.$   (8.3.3)

Equations (8.3.1)-(8.3.3) form a sys- 
tem of differential equations . In the 
given instance, we can solve these equa- 
tions one by one, having to deal each 
time only with one equation in one 


unknown. Indeed, neither $B$ nor $C$ enters into Eq. (8.3.1).
From it we therefore find that $A(t) = A_0 e^{-\omega t}$, where
$A_0$ is the number of atoms of element A at the initial time
$t = 0$. (We assume that $B = C = 0$ at $t = 0$.)

Substituting the expression for $A(t)$ into (8.3.2), we get an
equation involving only one unknown function $B(t)$:

$\frac{dB}{dt} = -vB + \omega A_0 e^{-\omega t}.$   (8.3.4)
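
Before turning to the analytic solution, it may help to see that the system (8.3.1)-(8.3.3) can always be integrated numerically, step by step. The sketch below uses the classical fourth-order Runge-Kutta scheme, which is our choice here and not a method used in the text. The values of $A_0$, $\omega$, $v$, $u$ and the step count are hypothetical.

```python
import math

def derivatives(state, omega, v, u):
    """Right-hand sides of (8.3.1)-(8.3.3) for state = (A, B, C)."""
    a, b, c = state
    return (-omega * a, -v * b + omega * a, -u * c + v * b)

def rk4_step(state, dt, omega, v, u):
    """Advance (A, B, C) by one time step dt with the classical Runge-Kutta scheme."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = derivatives(state, omega, v, u)
    k2 = derivatives(shifted(state, k1, dt / 2), omega, v, u)
    k3 = derivatives(shifted(state, k2, dt / 2), omega, v, u)
    k4 = derivatives(shifted(state, k3, dt), omega, v, u)
    return tuple(s + dt / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def integrate(a0, omega, v, u, t_end, steps):
    """Integrate from t = 0 with A = a0, B = C = 0 up to t = t_end."""
    state = (a0, 0.0, 0.0)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt, omega, v, u)
    return state

# Hypothetical values chosen only for illustration.
a_num, b_num, c_num = integrate(a0=1000.0, omega=1.0, v=0.5, u=0.2,
                                t_end=2.0, steps=2000)
print(f"A = {a_num:.4f} (exact {1000.0 * math.exp(-2.0):.4f}), "
      f"B = {b_num:.4f}, C = {c_num:.4f}")
```

The numerical value of $A$ can be compared directly with $A_0 e^{-\omega t}$; the values of $B$ and $C$ serve as a check on whatever formulas are obtained analytically.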

How does one go about solving this equation? We can find a
solution if we first consider the fate of the group of atoms of
B that have formed during the same interval of time, from $\tau$
to $\tau + \Delta\tau$. We will consider the number of atoms of
this group that are still “alive”, $\Delta B$ (that is to say,
atoms that have not disintegrated by time $t$), as a function of
time $t$. So as to avoid confusion about the time when we
measure the number of atoms and the time of formation of the
group, let us denote these times by different letters, $t$ and
$\tau$, respectively. At time $\tau$, the rate of formation of
atoms of element B was $\omega A(\tau)$. During the small time
interval $\Delta\tau$, a total of
$\Delta B_0 = \omega A(\tau)\,\Delta\tau$ atoms of element B
were formed.

How does the number of atoms in the group at hand depend on time
$t$? For $t < \tau$ it is equal to zero: the atoms of interest
have not yet formed since the group itself is still nonexistent,
$\Delta B = 0$. Let $t > \tau$. Observe that a time $t - \tau$
has already passed since the group began to form. The decay
probability of element B is $v$. Therefore, after the lapse of
time $t - \tau$ from the time of formation of the group the
number of untransformed atoms will be

$\Delta B(t) = \Delta B_0 e^{-v(t-\tau)} = \omega A(\tau) e^{-v(t-\tau)}\,\Delta\tau.$

To find the total number of atoms of element B at time $t$, we
have to add the number of atoms in all groups that formed prior
to $t$. If we take $\Delta\tau$ (and hence $\Delta B$ as well)
very small, then the sum turns into the integral

$B(t) = \int_0^t \frac{\Delta B(t)}{\Delta\tau}\,d\tau = \int_0^t \omega A(\tau) e^{-v(t-\tau)}\,d\tau.$

Observe that here the variable of integration is denoted by
$\tau$. The argument $t$, upon which $B$ depends, enters into
the integral twice: as the upper limit and in the integrand.
When integrating with respect to $\tau$ the quantity $t$ is to
be regarded as a constant. We can therefore write

$e^{-v(t-\tau)} = e^{-vt} e^{v\tau}$

and take $\omega e^{-vt}$ out from under the integral sign as a
factor that is independent of $\tau$. We then get

$B(t) = \omega e^{-vt} \int_0^t A(\tau) e^{v\tau}\,d\tau.$   (8.3.5)

If we set $A(\tau) = A_0 e^{-\omega\tau}$, we get the concrete
solution

$B(t) = \frac{A_0\omega}{v-\omega}\left(e^{-\omega t} - e^{-vt}\right).$   (8.3.6)

It is easy to verify, without evaluating the integral, that the
solution (8.3.5) satisfies the original equation (8.3.4) for any
function $A(\tau)$. Indeed, let us find the derivative
$dB(t)/dt$. By the rule of differentiating a product we get

$\frac{dB}{dt} = -\omega v e^{-vt}\int_0^t A(\tau) e^{v\tau}\,d\tau + \omega e^{-vt}\frac{d}{dt}\left(\int_0^t A(\tau) e^{v\tau}\,d\tau\right).$

Since, by the property of the derivative of an integral (see
Section 3.3),

$\frac{d}{dt}\left(\int_0^t A(\tau) e^{v\tau}\,d\tau\right) = A(t) e^{vt},$

it follows that

$\frac{dB}{dt} = -\omega v e^{-vt}\int_0^t A(\tau) e^{v\tau}\,d\tau + \omega A(t) = -vB + \omega A.$

The solution could also have been found without resorting to a
consideration of the separate groups of atoms. Now that the
solution has been found, it is already a simple matter to guess
the mathematical technique that will lead us to our goal. The
solution (8.3.5) is of the form

$B(t) = e^{-vt} I(t),$   (8.3.7)

where $I(t)$ stands for an integral that depends on $t$. We will
seek the solution in the form of a product of $e^{-vt}$ by the
unknown function $I(t)$ and will set up an equation for $I(t)$:

$\frac{dB}{dt} = \frac{d}{dt}\left(e^{-vt} I\right) = -v e^{-vt} I + e^{-vt}\frac{dI}{dt}.$   (8.3.8)

Substituting (8.3.8) and (8.3.7) into Eq. (8.3.4) yields

$e^{-vt}\frac{dI}{dt} = \omega A,$

or

$\frac{dI}{dt} = \omega e^{vt} A(t).$   (8.3.9)

By hypothesis, at the initial time $t = 0$ we have $B = 0$, and
hence $I = 0$ at $t = 0$. With this initial condition, the
solution to Eq. (8.3.9) has the form

$I(t) = \int_0^t \omega e^{v\tau} A(\tau)\,d\tau.$

Thus, the final result is

$B(t) = e^{-vt} I(t) = e^{-vt}\int_0^t \omega A(\tau) e^{v\tau}\,d\tau.$   (8.3.10)

In this formula it is essential, so as to avoid confusion, to
retain strict designations and not to denote the variable of
integration $\tau$ by the same letter we use for the upper limit
of integration, $t$.
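
As a numerical cross-check of this section, one can evaluate the integral $B(t) = \omega e^{-vt}\int_0^t A(\tau)e^{v\tau}\,d\tau$ by a simple quadrature for $A(\tau) = A_0 e^{-\omega\tau}$ and compare it with the closed form that results from carrying out the integration by hand, $B(t) = \frac{A_0\omega}{v-\omega}(e^{-\omega t} - e^{-vt})$. The sketch below does this with the trapezoidal rule. The parameter values and the number of subintervals are arbitrary choices made here.

```python
import math

def b_closed_form(t, a0, omega, v):
    """B(t) obtained by evaluating the integral analytically (requires v != omega)."""
    return a0 * omega / (v - omega) * (math.exp(-omega * t) - math.exp(-v * t))

def b_by_quadrature(t, a0, omega, v, n=20000):
    """B(t) = omega * exp(-v t) * integral_0^t A(tau) exp(v tau) dtau, trapezoidal rule."""
    h = t / n

    def integrand(tau):
        return a0 * math.exp(-omega * tau) * math.exp(v * tau)

    total = 0.5 * (integrand(0.0) + integrand(t))
    total += sum(integrand(i * h) for i in range(1, n))
    return omega * math.exp(-v * t) * total * h

# Hypothetical values chosen only for illustration.
a0, omega, v, t = 1.0, 0.3, 1.2, 2.5
exact = b_closed_form(t, a0, omega, v)
numeric = b_by_quadrature(t, a0, omega, v)
print(f"closed form: {exact:.10f}   quadrature: {numeric:.10f}")
```

Note that inside `b_by_quadrature` the loop variable plays the role of $\tau$ while `t` stays fixed, which mirrors the remark about keeping the two letters apart.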
8.4 Investigating the Solution for a
Radioactive Family (Series)

In the preceding section we brought to 
completion the solution of the problem 
in the case of two radioactive substances 
(elements). Let us now investigate this 
solution for two particular cases: 

(1) a short-lived parent element A 
and long-lived daughter element B, 

(2) a long-lived parent element A and 
short-lived daughter element B. 

Below, in addition to the decay probabilities $\omega$ and $v$,
we will make use of the mean lifetimes $t_A = 1/\omega$ and
$t_B = 1/v$. In the first case, where $t_A \ll t_B$, the nature
of the solution can be readily grasped without calculations and
formulas. The entire process breaks down into two stages. First,
when $t$ is of the order of $t_A$ (here, by hypothesis,
$t_A \ll t_B$ and so also $t \ll t_B$ in the first stage),
element A is transformed into element B, while during this time
there is hardly any disintegration of element B. During this
period the amount of B is equal to the difference between the
original amount $A_0$ and the amount $A$ remaining at time $t$:

$B(t) = A_0 - A(t) = A_0 - A_0 e^{-\omega t} = A_0\left(1 - e^{-\omega t}\right), \quad t \ll t_B.$

By the end of this period, practically the whole of element A
has been converted into B, and the quantity of B becomes equal
to the original amount of the parent element, $A_0$. The
quantity of element A becomes zero. Then follows a slow and
protracted disintegration of B:

$B(t) = A_0 e^{-vt}, \quad t \gg t_A.$

We will show how these qualitative ideas follow from the exact
formula. For the case of two radioactive elements A and B, we
obtained in the preceding section the formula

$B(t) = \frac{A_0\omega}{v-\omega}\left(e^{-\omega t} - e^{-vt}\right).$

In our case $t_A \ll t_B$ and $\omega \gg v$, and so it is more
convenient to interchange the signs so as to be dealing with
positive quantities in the parentheses and in the denominator of
the fraction. Then

$B(t) = A_0\frac{\omega}{\omega - v}\left(e^{-vt} - e^{-\omega t}\right).$   (8.4.1)

Since $v \ll \omega$, it follows that
$\omega/(\omega - v) \approx \omega/\omega = 1$.

We consider the expression $e^{-vt} - e^{-\omega t}$ for two
successive stages. First, when $t \ll t_B = 1/v$, it will be
true that $vt \ll 1$. Then $e^{-vt} \approx 1$. Since $t$ can be
a quantity of the order of $t_A$ and $\omega t$, consequently,
of the order of unity, it follows that $e^{-\omega t}$ must be
calculated exactly. From formula (8.4.1) we get

$B(t) \approx A_0\left(1 - e^{-\omega t}\right).$   (8.4.2)

In the second stage, when $t \gg t_A = 1/\omega$, it will be
true that $\omega t \gg 1$. We can disregard $e^{-\omega t}$ in
this stage, since $e^{-\omega t}$ is small not only with respect
to unity but also in comparison with $e^{-vt}$, because
$v \ll \omega$. We get

$B(t) = A_0 e^{-vt}.$   (8.4.3)

Thus, the exact formula does indeed 
yield the same results as those obtained 
from simple qualitative reasoning. 
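
A quick numerical comparison makes the agreement concrete. The sketch below evaluates the exact expression $B(t) = A_0\frac{\omega}{\omega-v}(e^{-vt} - e^{-\omega t})$ together with the two stage-by-stage approximations for a short-lived parent. The ratio $\omega/v = 100$ and the two sample times are hypothetical choices made here.

```python
import math

def b_exact(t, a0, omega, v):
    """Exact amount of B for the two-element chain (requires omega != v)."""
    return a0 * omega / (omega - v) * (math.exp(-v * t) - math.exp(-omega * t))

# Hypothetical short-lived parent: t_A = 0.01, t_B = 1.
a0, omega, v = 1.0, 100.0, 1.0

t_early = 0.02                                   # of the order of t_A, far below t_B
early_exact = b_exact(t_early, a0, omega, v)
early_approx = a0 * (1.0 - math.exp(-omega * t_early))   # first-stage formula

t_late = 0.5                                     # far above t_A
late_exact = b_exact(t_late, a0, omega, v)
late_approx = a0 * math.exp(-v * t_late)                 # second-stage formula

print(f"early: exact {early_exact:.4f}, approximate {early_approx:.4f}")
print(f"late:  exact {late_exact:.4f}, approximate {late_approx:.4f}")
```

With these values both approximations agree with the exact formula to within a few percent, and the agreement improves as the ratio $\omega/v$ is made larger.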

Now let us take up the second case, that of the long-lived
parent element A and short-lived daughter element B:

$t_A \gg t_B, \quad \omega \ll v.$

We consider the period when a time t 
considerably exceeding t B has passed 
since the onset of the process. In that 
case, element B that was formed at the 
beginning of the process has already ful- 
ly disintegrated by time t. Since B dis- 
integrates rapidly and in a short time, 
at every given instant of time there is
available only the quantity of element B
that has recently formed. What
we have here is a steady state (also 
known as a stationary state): element B 
is formed from A and straightway disin- 
tegrates; element B does not accumulate 
because it decays rapidly, but does 
not disappear completely because A is 
producing new quantities of B all the 
time. In a steady-state system, there are just as many atoms of
B disintegrating in unit time as there are atoms of B being
formed from A, so that $B$ here changes but little.
Mathematically, this condition is written as
$vB \approx \omega A$, whence

$B(t) \approx \frac{\omega}{v} A(t) = \frac{t_B}{t_A} A(t).$   (8.4.4)

In the steady state the instantaneous quantity of B is
proportional to the quantity of A and always represents some
small fraction of A. This fraction is small because in the case
at hand (the second case) $t_B \ll t_A$ and hence
$t_B/t_A \ll 1$, for otherwise there would be no steady state.

How do we obtain the steady-state equation from the exact
differential equation $dB/dt = -vB + \omega A$? Evidently, if we
take it that $dB/dt$ is small in comparison with each of the two
terms on the right, then replacing $dB/dt$ by 0 we approximately
get

$0 = -vB + \omega A$, or $vB = \omega A$.

Let us now examine the beginning of the process. At $t = 0$ we
have $A = A_0$ and $B = 0$. This means that at the beginning we
do not have a steady state, since by steady-state formulas we
should first have

$B_{st} = \frac{\omega}{v} A_0$

(the subscript on $B$ denotes a steady, or stationary, state).
At $t = 0$, element B is forming at the rate
$dB/dt = \omega A_0$, while there is no disintegration of B at
all at the initial time since $B = 0$.

It is possible to determine the time $t_1$ during which, for the
initial (constant) rate of buildup of B, the quantity $B_{st}$
will be attained. Indeed, if the rate of formation of B remains
constant, equal to $(dB/dt)_{t=0}$, then

$B = t\left(\frac{dB}{dt}\right)_{t=0}.$

Putting $B_{st} = (\omega/v) A_0$ and
$(dB/dt)_{t=0} = \omega A_0$, we get the desired time

$t_1 = \frac{B_{st}}{(dB/dt)_{t=0}} = \frac{1}{v} = t_B.$


Thus, the steady state is attained in a time roughly equal to
the mean lifetime of element B (we recall that our assumption
that the rate at which substance B is formed is constant is
valid only approximately). From the condition $t_B \ll t_A$ it
is evident that the quantity of element A changes but little
during this time.

On the whole, the approximate examination in the case of a
short-lived daughter element yields the following:

$B(t) = \left(\frac{dB}{dt}\right)_{t=0} t = \omega A_0 t$  if $t < t_B$,
                                                             (8.4.5)
$B(t) = B_{st} = \frac{\omega}{v} A_0 e^{-\omega t} = \frac{t_B}{t_A} A_0 e^{-\omega t}$  if $t > t_B$.

We get the function B = B (t) in the 
form of two lines: first the straight line 
1 (Figure 8.4.1), or a linear function, 
then an exponential curve. Figure 8.4.1 
was constructed on the assumption that
$t_A = 10 t_B$; instead of the exponential
curve we depicted the straight line 2.
It is easy to verify that for t = t B 
the two formulas in (8.4.5) yield almost 
the same result. 

Let us see what the exact solution (8.4.1) to Eq. (8.3.4) gives
us in the case at hand, where $v \gg \omega$ and
$t_B \ll t_A$. In the denominator we neglect $\omega$ compared
with $v$. For $vt \gg 1$ we also neglect $e^{-vt}$ in the
parentheses. This yields

$B \approx A_0\frac{\omega}{v} e^{-\omega t},$   (8.4.6)

which is precisely the steady-state solution.



Figure 8.4.1

We determine how close the solution is to a steady state by how
fast $e^{-vt}$ falls off. For very small values of $t$ (when
$vt < 1$, so that $\omega t$ is sure to be small), we get the
following by expanding $e^{-vt}$ and $e^{-\omega t}$ in a series
and confining ourselves to the first two terms:

$B \approx A_0\frac{\omega}{v-\omega}\left(1 - \omega t - 1 + vt\right) = A_0\omega t,$   (8.4.7)

which also coincides with the approximate result. Actually,
however, the exact formula yields a single smooth curve without
discontinuities and salient points (the dashed curve 3 in
Figure 8.4.1). The approach of this curve to the steady-state
solution depends on how fast $e^{-vt}$ diminishes. For instance,
for $e^{-vt}$ to make a correction of the order of 10%, we must
have $vt \approx 2.3$, or $t \approx 2.3/v = 2.3 t_B$. Here,
owing to the smallness of $\omega t$, we assume that
$e^{-\omega t} \approx 1$. Thus, true enough, the transition
from the stage of initial buildup to the stage where the
solution is equal, with sufficient precision, to the
steady-state solution takes place in a time of the order of the
time of disintegration, $t_B$.
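
The three regimes discussed for the short-lived daughter (linear buildup, transition, steady state) can be tabulated side by side. The sketch below compares the exact expression $B(t) = A_0\frac{\omega}{v-\omega}(e^{-\omega t} - e^{-vt})$ with the linear law $A_0\omega t$ and with the steady-state law $A_0(\omega/v)e^{-\omega t}$. The values $\omega = 0.01$ and $v = 1$ (so that $t_A = 100\,t_B$) are hypothetical and chosen so that the approximations are clearly visible.

```python
import math

def b_exact(t, a0, omega, v):
    """Exact amount of B for the two-element chain (requires v != omega)."""
    return a0 * omega / (v - omega) * (math.exp(-omega * t) - math.exp(-v * t))

def b_initial(t, a0, omega):
    """Linear buildup at the initial rate omega * A0."""
    return a0 * omega * t

def b_steady(t, a0, omega, v):
    """Steady-state amount (omega / v) * A(t)."""
    return a0 * (omega / v) * math.exp(-omega * t)

def transient_correction(t, omega, v):
    """Relative size of the exp(-v t) term compared with the exp(-omega t) term."""
    return math.exp(-(v - omega) * t)

a0, omega, v = 1.0, 0.01, 1.0     # hypothetical: t_B = 1, t_A = 100

for t in (0.05, 0.5, 2.3, 5.0, 10.0):
    print(f"t = {t:5.2f}  exact {b_exact(t, a0, omega, v):.6f}  "
          f"linear {b_initial(t, a0, omega):.6f}  "
          f"steady {b_steady(t, a0, omega, v):.6f}")

print(f"correction at t = 2.3 t_B: {transient_correction(2.3, omega, v):.3f}")
```

For small $t$ the exact values follow the linear column, for large $t$ the steady-state column, and the correction at $t = 2.3\,t_B$ comes out close to 10%.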

The example of a radioactive family 
(series) is highly instructive in the sense 
that obtaining a general exact solu- 
tion does not in the least signify that 
the work is at an end. The construc- 
tion of approximate theories for various 
limiting cases is an absolutely neces- 
sary part of the work and the existence 
of an exact formula does not at all take 
the place of an approximate theory. 
Approximate, yet clear-cut and picto- 
rial conceptions serve as a check on an 
exact formula. 

Also, approximate theories give us 
important new qualitative concepts 
such as that of the steady state. These 
can more easily be remembered and 
they possess a broader range of appli- 
cation than do the exact formulas. 
For instance, in the case of a radioactive family consisting of
several generations, A $\to$ B $\to$ C $\to$ D, the exact
formula is extremely unwieldy. But if $t_A$ is greater than all
other times, $t_B$, $t_C$, and $t_D$, then all the results
referring to the steady state are obtained just as easily as in
the case of two elements, A and B.

Sometimes the easiest way consists 
in obtaining an exact solution valid 
for arbitrary $v$ and $\omega$ (in our case), from which we then
(for $v \ll \omega$ or $v \gg \omega$) obtain, via mathematical
manipulations, some simple approximate formulas for the two
extreme cases. But
this is not yet all! If a simple approxi- 
mate formula has been obtained in a 
simple yet long-winded manner via 
the general solution, then alongside 
this there should be another, simple, 
way of obtaining the approximate for- 
mula. One should always attempt to 
find simple pathways because there 
will invariably appear problems in 
which the approach to an exact solu- 
tion is insuperably complicated and 
only a simple approximate approach 
makes it possible to advance in the so- 
lution. 

In practical situations, exact for- 
mulas come up just as rarely as algebra- 
ic, trigonometric, or other equations 
with solutions in whole numbers, al- 
though most of the textbook problems 
lead to exact formulas, just as problem 
books for junior classes abound in equa- 
tions that can always be solved in 
whole numbers. 

Observe that the conceptions of ra- 
dioactive families account for the strange 
result of exercise 8.1.5 about the 
amount of radium in the past: radium 
is a descendent (true, not direct, but 
via a number of intermediate substances) 
of uranium-238. It is therefore not 
correct to regard the present-day supply 
of radium as the result of the decay of 
primordial radium. Actually, radium 
is in a steady state with uranium. From the equation

$B = \frac{\omega}{v} A$

we find that the quantity of radium $B = 10^{-12}$ corresponds
to the uranium content

$A = \frac{v}{\omega} B = 3 \times 10^{6} B = 3 \times 10^{-6}.$

We have approximately found the present-day amount of
uranium-238 in rocks. The original abundance, $5 \times 10^9$
years ago, was twice as much, of the order of
$6 \times 10^{-6}$. These magnitudes are quite reasonable,
unlike the results of the exercises in Section 8.1.
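
The numbers in this estimate can be reproduced in a few lines. In the sketch below the half-life of uranium-238 is the value of $4.5 \times 10^9$ years used in this chapter, while the half-life of radium, taken as roughly 1600 years, is an assumption supplied here and not quoted in this section. Since decay probabilities are inversely proportional to half-lives, the ratio $v/\omega$ equals the ratio of the half-lives.

```python
import math

T_URANIUM_Y = 4.5e9      # half-life of uranium-238, years
T_RADIUM_Y = 1.6e3       # half-life of radium, years (assumed here, about 1600 years)

omega = math.log(2) / T_URANIUM_Y    # decay probability of the parent, per year
v = math.log(2) / T_RADIUM_Y         # decay probability of the daughter, per year

ratio = v / omega                    # A / B in the steady state
radium_content = 1e-12
uranium_content = ratio * radium_content

# Uranium decays as 2**(-t/T), so its abundance 5e9 years ago was larger by:
growth_factor = 2.0 ** (5e9 / T_URANIUM_Y)

print(f"v / omega        = {ratio:.2e}")
print(f"uranium content  = {uranium_content:.1e}")
print(f"abundance factor 5e9 years ago = {growth_factor:.2f}")
```

The ratio comes out near $3 \times 10^6$ and the factor for the primordial abundance is slightly above 2, in line with the rounded figures in the text.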

8.5 The Chain Reaction in the Fission 
of Uranium 

In 1938, Otto Hahn and Fritz Strass- 
mann in Germany and Irene and Frede- 
ric Joliot-Curie in France demonstrat- 
ed that when a neutron enters a nu- 
cleus of uranium, fission occurs in which 
the nucleus breaks up into two large 
fragments with the simultaneous emis- 
sion of two or three new neutrons. Ura- 
nium with an atomic weight of 235 
(uranium-235 for short) is very active 
in this respect. Naturally occurring 
uranium contains about 0.7% of uranium-235 atoms and 99.3% of
uranium-238 atoms.[8.7] The fission fragments of uranium-235 are
nuclei of medium atomic weight, from 75 to 160. The charge of
these nuclei lies within the range from 35 to 57, the sum of the
charges of two fragments always being equal to the charge on the
nucleus of uranium, that is, 92 elementary charges. The sum of
the atomic weights of the two fragments is equal to
$235 + 1 - \nu$, where 235 is the atomic weight of uranium-235,
1 is the atomic weight of the neutron that caused the fission,
and $\nu$ is the sum of the weights of the neutrons generated in
the act of fission. An enormous energy of $6 \times 10^{10}$ J/g
(per gram of fissioned uranium) is released in the fission
process. Thanks to this great energy, right after the fission
process the fragments rush apart at speeds of about $10^9$ cm/s;
then they decelerate and their kinetic energy is transformed
into heat.

[8.7] This was followed in 1939, in the laboratory of
I.V. Kurchatov in Leningrad, by the demonstration (carried out
by the Soviet scientists G.N. Flyorov and K.A. Petrzhak) that
uranium-238 is capable of undergoing spontaneous fission without
the entry of any neutron, although the probability of this event
is extremely low. The probability of radioactive decay (with the
emission of an $\alpha$ particle) of uranium-238 corresponding
to a half-life of $4.5 \times 10^9$ years is
$\omega = 5 \times 10^{-18}$ s$^{-1}$, while the probability of
the spontaneous fission of uranium-238 is lower by a factor of
$10^6$, that is, it is equal to about $5 \times 10^{-24}$
s$^{-1}$. Thus, in one second in one kilogram of uranium (which
is about $2.5 \times 10^{24}$ atoms) there occur roughly $10^7$
radioactive disintegrations and only 10 events of spontaneous
fission. On the other hand, in the very heaviest elements,
spontaneous fission becomes the most probable decay process (see
the end of Section 8.2, in particular Figure 8.2.2, where the
decay curve of mendelevium is given). The problem of the chain
reaction that we consider below does not involve spontaneous
fission at all.
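
The quoted speed of about $10^9$ cm/s can be checked against the quoted energy release by an order-of-magnitude estimate. The sketch below makes two simplifying assumptions that are ours and not the text's: all of the released energy appears as kinetic energy of the fragments, and the two fragments have equal mass (half of 236 atomic mass units each). Avogadro's number and the atomic mass unit are standard constants.

```python
import math

ENERGY_PER_GRAM_J = 6e10          # energy release per gram of fissioned uranium (from the text)
AVOGADRO = 6.022e23               # atoms per mole
ATOMIC_MASS_UNIT_KG = 1.6605e-27  # kilograms per atomic mass unit
URANIUM_MASS_NUMBER = 235

# Energy released in a single fission event, in joules.
energy_per_fission = ENERGY_PER_GRAM_J * URANIUM_MASS_NUMBER / AVOGADRO

# Assumption: two equal fragments share the energy equally.
fragment_mass_kg = 0.5 * (URANIUM_MASS_NUMBER + 1) * ATOMIC_MASS_UNIT_KG
fragment_energy = 0.5 * energy_per_fission

# Nonrelativistic kinetic energy E = m * speed**2 / 2.
speed_m_per_s = math.sqrt(2.0 * fragment_energy / fragment_mass_kg)
speed_cm_per_s = speed_m_per_s * 100.0

print(f"energy per fission: {energy_per_fission:.2e} J")
print(f"fragment speed:     {speed_cm_per_s:.1e} cm/s")
```

The result is of the order of $10^9$ cm/s, a few percent of the speed of light, so the nonrelativistic formula is adequate for this rough check.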

The source of this energy is the elec- 
tric repulsion of two like-charged frag- 
ments. Before the nucleus is separat- 
ed into two parts, the nuclear forces 
between the particles that make up the 
nucleus balance the electric repulsive 
forces. But as soon as the nucleus has 
broken up into two separate fragments, 
the repulsion of these two fragments is 
not countered in any way and so they 
fly apart at high speed. The fragments 
are very quickly brought to rest in a 
dense substance. Their time of flight is between $10^{-13}$ and
$10^{-12}$ s. In this time they traverse distances from
$10^{-4}$ to $10^{-3}$ cm. The kinetic energy of the fragments
is converted into heat. The neutrons produced in fission have
velocities of about the same order as the fragments (about
$2 \times 10^9$ cm/s).

Of crucial importance for the practi- 
cal utilization of the energy of nuclear 
fission is the fact that a fission event 
caused by one neutron gives rise to more 
than one neutron. It is quite clear that 
if the neutrons do not leave the system, 
their number will increase in geomet- 
ric progression with time, that is, in 
accordance with the law of the exponen- 
tial function. The rate of energy release 
will build up by the same law, in 
proportion to the number of neutrons. 
And even if at the onset of the process 
there were few neutrons, their number
builds up so fast that the energy will 
be released at a rate convenient for 
practical use (for instance, as a source 
of energy for a nuclear power plant), and 
in just a short additional space of time 
the energy release will build up to such 
an extent that an atomic explosion will 
take place. In reality, some of the neutrons
leave the system and some are captured
by other nuclei without causing fission.
We can utilize this to control
the number of neutrons and, in a parti- 
cular case, attain a steady-state system 
in which the number of newly formed 
neutrons in unit time is equal to the 
number of used up neutrons, so that 
the number of neutrons in the system 
remains the same in the course of time, 
and the energy can be released at a con- 
stant rate. That precisely is the regime 
we need if atomic energy is to be used 
for peaceful purposes. 

Our immediate task is to set up and 
investigate the equation describing the 
number of neutrons in a system as a
function of time. This will be done in 
the sections below. 


8.6 Multiplication of Neutrons 
in a Large System 

Let us first derive an equation for the 
variation in the number of neutrons 
with time in a very large system (say, 
in a large chunk of uranium-235), 
when loss of neutrons to the outside 
can be neglected.^{8.8} The neutrons can
all be regarded as having the same 
speed; we denote it by v. 

Fission of a nucleus occurs in roughly 
half of all cases when a neutron enters 
a nucleus of uranium-235. In the other 
half, the neutron emerges leaving the 
nucleus in the same state, with the num- 
ber of neutrons remaining unchanged. 
The uranium nucleus is a sphere of radius
$R$ of the order of $10^{-12}$ cm.


8.8 We will consider the simplest case of
metallic uranium-235 without graphite moderator.



How often will a neutron in flight 
inside the metal hit a nucleus of ura- 
nium? 

In a small time interval dt a neutron 
traverses a distance of vdt. Let us pic- 
ture a cylinder whose axis is the route 
covered by the neutron; the radius of 
the cylinder is equal to the radius R 
of the uranium nucleus. The neutron 
collides with the nuclei whose centers 
lie inside the cylinder. If this is so, 
the path of the neutron will pass at a 
distance less than R from the center of 
the nucleus, and so the neutron will 
hit the nucleus and enter it. The volume
of the cylinder is equal to $\pi R^{2} v\,dt$.

Figure 8.6.1 clarifies these ideas. The 
path of a neutron is shown by a dashed 
line along the cylinder’s axis. In the 
case A the center of the nucleus lies 
inside the cylinder, with the result 
that the neutron hits the nucleus, while 
in the case B the center lies outside 
the cylinder and the neutron does not 
hit the nucleus. 

There are N atoms in a unit volume 
of metallic uranium and hence N 
nuclei (the dimensions of $N$ are cm$^{-3}$).
Therefore, in the volume $\pi R^{2} v\,dt$ that
interests us there are $N \pi R^{2} v\,dt$ nuclei.
There will be just as many events of a
neutron hitting a nucleus during the
small time interval $dt$. Not every neutron
hit makes a nucleus fission. Let $\alpha$
be the fraction of cases in which a neutron
hitting a nucleus causes fission ($\alpha \approx 1/2$
in the case of uranium-235). Then the
number of fissions during time $dt$
is equal to $N \alpha \pi R^{2} v\,dt$.

The quantity $\alpha \pi R^{2}$, which has the
dimensions of area since $\alpha$ and $\pi$
are dimensionless, is called the cross
section of fission and is denoted by
$\sigma_f$ (the subscript f on the Greek letter
sigma stands for “fission”).

If there are n neutrons in the bulk 
of metallic uranium, the number of 
fissions in time dt is equal to 

$$nN\sigma_f v\,dt. \tag{8.6.1}$$

Each act of fission produces $\nu$ neutrons,
but this involves the absorption of one
neutron, so the number of neutrons in
every fission event increases by $\nu - 1$.
Associated with the number of fissions 
(8.6.1) is the variation in the number of 
neutrons 

$$dn = nN(\nu - 1)\sigma_f v\,dt. \tag{8.6.2}$$

From this equation we get

$$\frac{dn}{dt} = nN(\nu - 1)\sigma_f v.$$

Set

$$N(\nu - 1)\sigma_f v = a. \tag{8.6.3}$$

Then

$$\frac{dn}{dt} = an.$$

We already know that the solution to 
this equation is 

$$n(t) = n_0 e^{at}, \tag{8.6.4}$$

where $n_0$ is the number of neutrons in
the system at $t = 0$.

To summarize, then, if the number 
of neutrons in a system varies solely 
because of fission, then the number of 
neutrons increases in geometric progression,
while time increases in arithmetic
progression.

Indeed, if we take a number of equally
spaced instants of time,

$$t_1,\ t_1 + \Delta t,\ t_1 + 2\Delta t,\ t_1 + 3\Delta t,\ \ldots, \tag{8.6.5}$$

then the corresponding number of neutrons is

$$n_1 = n_0 e^{at_1},\ f n_1,\ f^{2} n_1,\ f^{3} n_1,\ \ldots, \tag{8.6.5a}$$

where $f = e^{a\Delta t}$.

This way of describing the process, 
growth in the number of neutrons in 
geometric progression, is common in the 


popular literature (see footnote 4.14). 
Physicists and engineers speak rather 
of an exponential growth (in accord with 
the law of exponential increase) and 
rarely use (8.6.5) and (8.6.5a). The ex- 
ponential law (8.6.4) is characterized 
by the growth rate $a$.

Let us find the dimensions of $a$.
In (8.6.4) $at$ is a dimensionless quantity
and, consequently, the dimensions of
$a$ are s$^{-1}$. The same result can be obtained
if we recall that

$$a = N(\nu - 1)\sigma_f v,$$

where $N$ is measured in cm$^{-3}$, $\sigma_f$ in
cm$^{2}$, and $v$ in cm/s.

Let us find the approximate value of
the constant $a$. The density of uranium
is roughly equal to 18 g/cm$^{3}$. The number
of nuclei per cubic centimeter, $N$,
can be calculated by recalling Avogadro’s
number, which is equal to $6 \times 10^{23}$
atoms in one gram-atom of any
substance. Hence, 235 grams of uranium-235
contain $6 \times 10^{23}$ atoms, or
$6 \times 10^{23}$ nuclei. One cubic centimeter
of uranium-235 contains
$(18/235) \times 6 \times 10^{23} \approx 4 \times 10^{22}$ nuclei, that
is, $N \approx 4 \times 10^{22}$ 1/cm$^{3}$. Substituting
the mean value $\nu \approx 2.5$, $v \approx 2 \times 10^{9}$ cm/s,
and $\sigma_f = (1/2)\pi(10^{-12})^{2} \approx 1.6 \times 10^{-24}$ cm$^{2}$
into the expression for $a$, we get

$$a \approx 4 \times 10^{22} \times 1.5 \times 1.6 \times 10^{-24} \times 2 \times 10^{9} \approx 2 \times 10^{8}\ \text{s}^{-1},$$

or

$$a^{-1} \approx 5 \times 10^{-9}\ \text{s}.$$
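The substitution above is easy to repeat by machine. The following Python sketch is only a check of the arithmetic: the variable names are ours, the inputs are the rounded values quoted in the text, and so no more than the order of magnitude of the result should be trusted.

```python
import math

# Rounded values quoted in the text (CGS units); names are illustrative.
AVOGADRO = 6e23        # atoms per gram-atom
density = 18.0         # g/cm^3, metallic uranium
atomic_weight = 235.0  # grams per gram-atom of uranium-235
nu = 2.5               # mean number of neutrons per fission
v = 2e9                # neutron speed, cm/s
R = 1e-12              # radius of the nucleus, cm
alpha = 0.5            # fraction of hits that cause fission

N = density / atomic_weight * AVOGADRO  # nuclei per cm^3
sigma_f = alpha * math.pi * R**2        # fission cross section, cm^2
a = N * (nu - 1) * sigma_f * v          # growth rate (8.6.3), 1/s

print(f"N       = {N:.2e} 1/cm^3")
print(f"sigma_f = {sigma_f:.2e} cm^2")
print(f"a       = {a:.2e} 1/s, 1/a = {1 / a:.2e} s")
```

Without the intermediate rounding the growth rate comes out slightly above $2 \times 10^{8}$ s$^{-1}$, which agrees with the estimate in the text to the accuracy that such a calculation can claim.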

To summarize, if the neutrons do not
leave the system, their number increases
by a factor of $e$ in approximately
$5 \times 10^{-9}$ second. At this rate of buildup,
in one microsecond, or $10^{-6}$ s,
the number of neutrons has increased
by a factor of $e^{2 \times 10^{8} \times 10^{-6}} = e^{200}$, that
is, approximately, $10^{0.43 \times 200} = 10^{86}$.

One metric ton of uranium-235 contains
roughly $2.5 \times 10^{27}$ nuclei. If the
neutrons do not leave the system, this 
quantity of uranium will fission in less 
than one microsecond. This process is 
an explosion of enormous force. 
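The two numbers just quoted can be checked in a few lines. The sketch below makes two simplifying assumptions of our own: the chain starts from a single neutron, and the moment at which the number of neutrons reaches the number of nuclei is taken as a rough measure of the moment at which all the uranium has fissioned.

```python
import math

a = 2e8                  # growth rate from the text, 1/s
t = 1e-6                 # one microsecond, s
nuclei_per_ton = 2.5e27  # nuclei in one metric ton of uranium-235

# Growth factor e^(a t) over one microsecond, expressed as a power of ten.
exponent = a * t
decades = exponent * math.log10(math.e)

# Time for a single neutron to multiply to 2.5e27 neutrons.
t_all = math.log(nuclei_per_ton) / a

print(f"e^{exponent:.0f} is about 10^{decades:.1f}")
print(f"time to reach {nuclei_per_ton:.1e} neutrons: {t_all:.2e} s")
```

Under these assumptions the required time is a few tenths of a microsecond, in agreement with the statement that the whole mass fissions in less than one microsecond.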
Such a rate of buildup is not permis-
sible if we want to use the fission pro- 
cess for generating electric power. It is 
necessary that neutrons leave the sys- 
tem and thus reduce the rate of neu- 
tron buildup. 

8.7 Escape of Neutrons 

Picture a mass of uranium-235 in the
form of a sphere of radius $r$. We have to
set up an equation for the variation of
the number n of neutrons inside the 
sphere. Assume for the sake of simplic- 
ity that the sphere is fixed to a thin 
support so that it is surrounded by a 
complete void and a neutron that has 
left the sphere will never enter it 
again. 

How can we determine the neutron 
flux (the number of neutrons leaving 
the sphere in unit time)? We make a 
rough calculation. Consider a small time 
interval dt. During this time each 
neutron covers a distance of vdt. Where 
are the neutrons that leave the sphere 
in time dt ? Evidently, they will have 
to be inside the sphere in a thin layer 
adjacent to the surface of the sphere but 
at a distance not exceeding vdt from 
the surface, otherwise during time dt 
they will not reach the surface, cross 
it, and leave it for good. But neither 
will all those neutrons that are inside 
the layer of thickness vdt be able to 
leave in time dt since not all neutrons 
inside the layer have velocity directed 
outward along the radius. In our very 
rough calculation we will ignore this 
latter circumstance. 

How is it possible to find the number 
of neutrons in the layer? There are a 
total of n neutrons in the whole sphere. 
The volume of the sphere is $V = (4/3)\pi r^{3}$,
while the volume of the
thin layer that interests us near the
surface is approximately equal to $Sv\,dt$
if $v\,dt$ is small. Here, $S = 4\pi r^{2}$ is the
surface area of the sphere.

The mean density of neutrons (the
number per unit volume) is $C = n/V$.
Suppose that the density near the surface
in the thin layer does not differ
from the mean density. Then the number
of neutrons in this layer is

$$CSv\,dt = \frac{nS}{V}\,v\,dt.$$

Therefore the flux (the number of neutrons
leaving in unit time) is

$$\frac{nS}{V}\,v = \frac{n \cdot 4\pi r^{2}\,v}{(4/3)\pi r^{3}} = \frac{3nv}{r}.$$


Actually, the neutron density near 
the surface is less than the mean densi- 
ty and, what is more (this was noted 
above), the neutron velocities have 
different directions and not all the neu- 
trons leave the sphere. The neutron 
flux is thus less than we obtained:

$$q = \frac{3knv}{r}, \tag{8.7.1}$$

where $k$ is a numerical factor less than
unity. Later on, in Section 8.10, we
will compare our results with experiment
and find that $k$ is close to 0.3.
If nuclear fission does not occur inside
the sphere and no new neutrons are generated,
then for the number of neutrons
inside the sphere we get the equation
$dn/dt = -q$ or, using (8.7.1),

$$\frac{dn}{dt} = -\frac{3kv}{r}\,n.$$

Setting

$$\frac{3kv}{r} = b, \tag{8.7.2}$$

we obtain

$$\frac{dn}{dt} = -bn.$$

The solution to this equation is familiar:

$$n = n_0 e^{-bt}. \tag{8.7.3}$$

The mean residence time of neutrons inside
the sphere is, by (8.7.3),

$$\bar{t} = \frac{1}{b} = \frac{r}{3kv}. \tag{8.7.4}$$

If $k = 0.3$, then (8.7.4) yields $\bar{t} \approx r/v$.
Therefore, the mean residence time is
roughly equal to the time during which
a neutron moving at a speed of $v$ travels
a distance equal to the radius $r$
of the sphere.
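To get a feeling for the magnitudes, formulas (8.7.2) and (8.7.4) can be evaluated for a sample sphere. In the sketch below the function names and the radius of 8.5 cm are illustrative choices of ours; the speed $v$ and the factor $k = 0.3$ are the values used in the text.

```python
def escape_rate(r, v, k=0.3):
    """Escape constant b = 3kv/r of (8.7.2); 1/s for r in cm and v in cm/s."""
    return 3.0 * k * v / r

def mean_residence_time(r, v, k=0.3):
    """Mean residence time 1/b = r/(3kv) of (8.7.4), in seconds."""
    return 1.0 / escape_rate(r, v, k)

r = 8.5   # cm, illustrative radius
v = 2e9   # cm/s, neutron speed used in the text

b = escape_rate(r, v)
t_mean = mean_residence_time(r, v)
print(f"b = {b:.2e} 1/s, mean residence time = {t_mean:.2e} s")
print(f"for comparison, r/v = {r / v:.2e} s")
```

For $k = 0.3$ the factor $3k$ equals 0.9, so the mean residence time differs from $r/v$ by about ten percent, which is why the text writes $\bar{t} \approx r/v$.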

An exact consideration of the escape 
of neutrons requires extraordinarily 
laborious computations. It is very im- 
portant from the start of one’s studies 
to get used to approximate calculations 
of all quantities of interest. Exact cal- 
culations are frequently very involved 
and require quite a different range of 
knowledge, at times even the collective 
efforts of many workers and the use
of electronic computers, and so on. But 
does this mean that a student engaged 
in self-instruction should give up the 
desire to consider a problem? There al- 
ways exist simple, even though rough, 
methods (similar to the one just consid- 
ered) for an approximate approach to 
a problem. To stop short of an approxi- 
mate solution because the exact compu- 
tations are complicated is merely to 
hide one’s lack of courage. Very often, 
just such hesitancy is destructive of
the first steps of a scientist or inventor! 

8.8 Critical Mass 

Up to now we have considered separate- 
ly two processes: the multiplication 
of neutrons without regard for their 
escape and the escape of neutrons with-
out regard for their multiplication. 

Let us now consider a system in 
which neutrons multiply and can es- 
cape. As we know, in unit time, a times 
n neutrons are formed and b times n 
neutrons escape from the system. Since 
the variation of the number of neutrons 
in unit time is $dn/dt$, it follows that

$$\frac{dn}{dt} = an - bn,$$

or

$$\frac{dn}{dt} = cn, \tag{8.8.1}$$

with $c = a - b$. For a given initial
number of neutrons $n_0$, Eq. (8.8.1) has
the solution

$$n = n_0 e^{ct}. \tag{8.8.2}$$

This solution leads to quite different
results depending on whether $c$ is
positive or negative. Indeed, from
(8.8.2) it is evident that when $c$ is negative
the number of neutrons $n$ falls
off with increasing $t$, which means that
$n$ tends to zero as $t \to \infty$. But if $c$ is positive,
then $n$ increases with $t$, that is,
$n$ grows without limit in the course of
time. Only the effect of new physical
factors not accounted for in Eq.
(8.8.1) can halt the growth of $n$.

Thus the value $c = 0$ is a critical
value, for it separates the distinct
types of solution with increasing and decreasing
number of neutrons. Since
$c = a - b$, for a given $a$ we can speak
of the critical value $b_{cr} = a$, since
$c = a - b > 0$ for $b < b_{cr}$ and
$c = a - b < 0$ for $b > b_{cr}$. The quantity
$a$ is determined by the properties
of the fissionable substance: according
to (8.6.3), $a = Nv\sigma_f(\nu - 1)$. The quantity
$b$ depends on the amount of
fissionable substance taken:

$$b = \frac{3kv}{r}.$$

The concept is therefore introduced of
the critical value of the radius $r_{cr}$ for
which $b = b_{cr} = a$. From (8.6.3) and
(8.7.2) it follows that $3kv/r_{cr} = Nv\sigma_f(\nu - 1)$, whence

$$r_{cr} = \frac{3k}{N\sigma_f(\nu - 1)}.$$

The mass of the sphere of radius $r_{cr}$
is called the critical mass, $m_{cr}$. It is
clear that

$$m_{cr} = \frac{4}{3}\pi r_{cr}^{3}\rho, \tag{8.8.3}$$

with $\rho$ the density of the fissionable
substance.^{8.9}

For $r > r_{cr}$ (this is the same as
$m > m_{cr}$), $c > 0$ and we have a multiplication
of neutrons. For $r < r_{cr}$
($m < m_{cr}$), $c < 0$ and the original
quantity of neutrons (exponentially)
diminishes.

8.9 As before, we consider the mass of
a fissionable material, say, uranium, in the
form of a sphere.

Suppose we have a sphere
of radius $r$. Its surface area $S$ and volume
$V$ are

$$S = 4\pi r^{2}, \qquad V = \frac{4}{3}\pi r^{3},$$

whence

$$\frac{S}{V} = \frac{4\pi r^{2}}{(4/3)\pi r^{3}} = \frac{3}{r}.$$

If r is small, this ratio is great. But if 
r is great, the ratio is small. No wonder 
that when the radius is small, that is, 
when the ratio of surface area to vol- 
ume is great, the neutron escape in- 
creases and the conditions for neutron 
multiplication deteriorate. It is surprising
how sharply the number of neutrons
varies with $b$: if $b > b_{cr}$, in a short
time the number of neutrons becomes
practically zero, irrespective of whether
$b = 1.01b_{cr}$ or $b = 2b_{cr}$. If $b < b_{cr}$,
the number of neutrons increases
without limit both for $b = 0.99b_{cr}$
and for $b = 0.5b_{cr}$, although the rate
differs. This is precisely why one speaks
of the critical value of $b$, the critical
value of $r$, or the critical value of
mass. When above critical, the mass
is said to be supercritical; when less
than critical, it is called a subcritical
mass.

In Figure 8.8.1 are given the curves
$n = n_0 e^{(a-b)t}$ for a number of values
of $b$. Let us construct the curves of $n$
as a function of $b$ for a few definite values
of time $t$. In the computations, $a$
is taken equal to $2 \times 10^{8}$ s$^{-1}$. Figure
8.8.2 shows the $n$ versus $b$ curves for
$t = 5 \times 10^{-9}$ s, $t = 15 \times 10^{-9}$ s, and
$t = 30 \times 10^{-9}$ s.




Figure 8.8.2 

The curves corresponding to $t = 15 \times 10^{-9}$ s
and $t = 30 \times 10^{-9}$ s
intersect with the vertical axis ($b = 0$)
at $n = 20n_0$ and $n = 400n_0$, respectively.

As is seen in Figures 8.8.1 and 8.8.2, 
the greater the time t , the more diver- 
gent are the n versus t curves in Figure 
8.8.1, the steeper are the n versus b 
curves in Figure 8.8.2, and the more 
sharply is the criticality of the value 
$b = 2 \times 10^{8}$ s$^{-1}$ (in this example) manifested.

If we take $t > 10^{-6}$ s, the $n$ versus $b$
curve cannot be distinguished from the
vertical line $b = b_{cr} = 2 \times 10^{8}$ s$^{-1}$:
$n = 0$ for $b > b_{cr}$ and $n = \infty$ for
$b < b_{cr}$.
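The values read off Figures 8.8.1 and 8.8.2 can be reproduced directly from $n = n_0 e^{(a-b)t}$. The sketch below tabulates the ratio $n/n_0$; the function name and the particular set of $b$ values are our own choice, while $a$ and the three instants of time are those of the text.

```python
import math

A_RATE = 2e8                  # growth rate a from the text, 1/s
TIMES = (5e-9, 15e-9, 30e-9)  # the three instants of Figure 8.8.2, s

def neutron_ratio(b, t, a=A_RATE):
    """Return n/n0 = exp((a - b) t) for escape constant b and time t."""
    return math.exp((a - b) * t)

# One row per value of b, one column per instant of time.
for b in (0.0, 1e8, 2e8, 3e8, 4e8):
    row = "  ".join(f"{neutron_ratio(b, t):10.4g}" for t in TIMES)
    print(f"b = {b:.0e} 1/s: {row}")
```

At $b = 0$ the two later columns give about 20 and about 400, the intercepts quoted above; at $b = b_{cr}$ every entry equals 1; and the longer the time, the wider the gap between the rows on either side of $b_{cr}$.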

8.9 Subcritical and Supercritical Mass 
for a Constant Source of Neutrons 

In the preceding section we considered 
the problem of the variation with time 
of the number of neutrons for a given 
initial number $n_0$ of neutrons. We now
pose a somewhat different problem.
Suppose at time zero ($t = 0$) the number
of neutrons is zero and a neutron source
is switched on at this instant of time,
emitting $q_0$ neutrons per unit time. This
problem leads to the equation

$$\frac{dn}{dt} = cn + q_0, \tag{8.9.1}$$

with $c = a - b$. We seek the solution
to this equation with the initial condition
$n = 0$ at $t = 0$.


Figure 8.8.1 
A method of solution was given in
Section 8.3 for a similar problem. We 
give a brief review of the reasoning 
there. 

We seek the number of neutrons at
time $t \ne 0$. The entire time interval
from 0 to $t$ is partitioned into subintervals
$\Delta\tau$. We consider one such subinterval
from $\tau$ to $\tau + \Delta\tau$. During this
time the source emitted $q_0\Delta\tau$ neutrons.
If the source operated only during one
subinterval of time $\Delta\tau$, then we would
be dealing with the problem of the preceding
section with the initial number
of neutrons $n_0 = q_0\Delta\tau$, the only difference
being that these neutrons are emitted
at time $t = \tau$ and not at time $t = 0$.
Therefore, instead of the solution
$n = n_0 e^{ct}$ we would have the solution
$n = n_0 e^{c(t - \tau)} = q_0\Delta\tau\,e^{c(t - \tau)}$ (this solution
refers to $t > \tau$; for $t < \tau$ we have
$n = 0$), since clearly it is precisely
on the time that elapsed after the initial
number of neutrons was fixed that
the number of neutrons depends, that
is, in the given case, on the quantity
$t - \tau$.

Actually, however, the neutron source 
is in constant operation during the 
whole time from 0 to t, and so we have 
to add the contributions of all neutrons 
emitted by the source in the various 
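subintervals of time $\Delta\tau$, the sum of
these intervals covering the entire
time interval from 0 to $t$. Such a sum,
given small subintervals $\Delta\tau$, is an integral,
and so

$$n(t) = \int_0^t q_0 e^{c(t - \tau)}\,d\tau.$$

It is easy to evaluate this integral:

$$n(t) = q_0 e^{ct}\int_0^t e^{-c\tau}\,d\tau = q_0 e^{ct}\left(-\frac{1}{c}\right)e^{-c\tau}\,\Big|_0^t = q_0 e^{ct}\left(-\frac{1}{c}\right)\left(e^{-ct} - 1\right) = \frac{q_0}{c}\left(e^{ct} - 1\right). \tag{8.9.2}$$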

It is readily seen that this solution
satisfies the equation

$$\frac{dn}{dt} = \frac{d}{dt}\left[\frac{q_0}{c}\left(e^{ct} - 1\right)\right] = q_0 e^{ct} = cn + q_0$$

and the initial condition $n = 0$ at
$t = 0$.

The same formula (8.9.2) yields a 
solution for a positive or negative val- 
ue of c, though the shape of the n 
versus t curve is essentially different. 
For $c > 0$, that is, for $a > b$, the exponent
$ct$ is positive, so that $e^{ct}$ quickly
exceeds unity with increasing $t$. For
large $t$ and positive $c$ we have

$$n \approx \frac{q_0}{c}\,e^{ct}.$$

For $c < 0$ we have $ct < 0$, and so $e^{ct}$
becomes much less than unity with increasing
$t$, and the values of $n$ approach
the number $-q_0/c$ (this number is
positive since $c < 0$). The $n$ versus $t$
curves are shown in Figure 8.9.1.

Note the curious case of $c = 0$.
If $c = 0$, formula (8.9.2) cannot be
used directly. Expand $e^{ct}$ in a series:

$$e^{ct} = 1 + ct + \frac{(ct)^{2}}{2} + \ldots.$$

Substituting this into (8.9.2) yields

$$n = q_0\left(t + \frac{1}{2}ct^{2} + \ldots\right).$$

This formula is valid for $c = 0$, too.
We then have ($c = 0$)

$$n(t) = q_0 t. \tag{8.9.3}$$

This result is also readily obtainable
from (8.9.1). Indeed, Eq. (8.9.1) at
$c = 0$ has the form $dn/dt = q_0$, whence
$n(t) = q_0 t + A$, where $A$ is the constant
of integration. For $t = 0$ it must
be true that $n = 0$, whence $A = 0$ and
we arrive at (8.9.3).
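The three cases $c > 0$, $c < 0$, and $c = 0$ can be gathered into one small function. The sketch below is our own illustration of formulas (8.9.2) and (8.9.3); the function name and the test values are hypothetical. It uses the standard library function `math.expm1`, which evaluates $e^{x} - 1$ accurately for small $x$, so that values of $c$ very close to zero cause no loss of precision.

```python
import math

def neutrons_from_source(t, c, q0):
    """Number of neutrons at time t for dn/dt = c*n + q0 with n(0) = 0."""
    if c == 0:
        return q0 * t                   # limiting case (8.9.3)
    return q0 / c * math.expm1(c * t)   # formula (8.9.2)

q0 = 4.0  # neutrons per unit time, illustrative value

# Subcritical case: n levels off at -q0/c = q0/|c|.
print(neutrons_from_source(50.0, -2.0, q0))

# Critical case and a nearly critical one: both close to q0 * t.
print(neutrons_from_source(3.0, 0.0, q0))
print(neutrons_from_source(3.0, 1e-9, q0))

# Supercritical case: n grows roughly as (q0/c) e^(ct).
print(neutrons_from_source(10.0, 1.0, q0))
```

The second and third printed values nearly coincide, which illustrates the remark that the series form of the solution passes continuously into $n = q_0 t$ as $c$ tends to zero.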

As was shown above, when $c < 0$
the concentration of neutrons in the
course of time attains a constant value
$-q_0/c$ or, what is the same thing,
$q_0/|c|$. The smaller the value of
$|c|$ (the closer we are to the critical
state), the greater this constant value.
Thus, for a very weak source (small
$q_0$), a mass close to critical can yield
an arbitrarily large number of neutrons, 
a large number of fissions, and a great 
quantity of energy. Such in principle 
is the mode of operation of nuclear re- 
actors. 

The maintenance of such a regime is
no easy task, since small variations in
$b$ and $c$ drastically alter the magnitude
of $q_0/c$ when $c$ is close to zero, and operation
at $c$ close to zero is necessary if we
wish to obtain a big power output for
small $q_0$. However, this engineering
problem can be solved by means of automatic
control: when $n$ gets out of
bounds, the control system changes $a$
or $b$. Besides, there are also natural factors
that facilitate control: for instance,
when $n$ increases, the temperature
of the active material rises and, as
it turns out, $c$ diminishes, so that to a
certain extent the system is self-regulating.

8.10 The Critical Mass 

We now know how sensitive the prop- 
erties of a system are depending on 
whether we have a supercritical or sub- 
critical mass. Let us examine in more 
detail the condition of criticality 

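$$r_{cr} = \frac{3k}{N\sigma_f(\nu - 1)}.$$

Substituting the numbers for uranium-235,
$\sigma_f \approx 1.6 \times 10^{-24}$ cm$^{2}$, $\nu \approx 2.5$,
$N \approx 4 \times 10^{22}$ 1/cm$^{3}$, we get (in centimeters)

$$r_{cr} = \frac{3k}{4 \times 10^{22} \times 1.6 \times 10^{-24} \times 1.5} \approx 30k.$$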

We do not know how to determine
the coefficient $k$; all we know is that
it is less than unity. Let us find this
coefficient by comparing the formula
with experiment. Experiments show
that the critical mass of uranium-235
is about 50 kg. A uranium sphere weighing
50 kilograms has a radius of about
8.5 cm, so, in the given case,

$$k \approx \frac{8.5}{30} \approx 0.3.$$
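The comparison with experiment can be repeated numerically. The sketch below uses the rounded inputs of the text together with the density of 18 g/cm$^{3}$ quoted in Section 8.6; the variable names are ours, and the result is only as reliable as these rounded values.

```python
import math

N = 4e22           # nuclei per cm^3
sigma_f = 1.6e-24  # fission cross section, cm^2
nu = 2.5           # neutrons per fission
rho = 18.0         # g/cm^3, density of uranium
m_cr = 50e3        # g, experimental critical mass quoted in the text

# r_cr = coefficient * k, with the coefficient in centimeters.
coefficient = 3.0 / (N * sigma_f * (nu - 1))

# Radius of a sphere of mass m_cr: m = (4/3) pi r^3 rho.
r_experiment = (3.0 * m_cr / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

k = r_experiment / coefficient
print(f"r_cr = {coefficient:.1f} k cm")
print(f"radius of a 50 kg sphere = {r_experiment:.2f} cm")
print(f"k = {k:.2f}")
```

With these inputs the coefficient is a little above 30 cm and the radius a little above 8.5 cm, so $k$ again comes out close to 0.3.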

Let us examine the physical significance
of the formula for the critical
radius. The neutron velocities cancelled
out in the expression for $r_{cr}$, which
means that the formula for $r_{cr}$ can be
obtained without regarding the course
of the process in time and without examining
the rate of neutron multiplication
and the rate of neutron escape from
the system.

If we disregard the dimensionless
factor $3k$ (it is of the order of unity),
the formula for the critical radius becomes

$$r_{cr} N \sigma_f = \frac{1}{\nu - 1}. \tag{8.10.1}$$

What is the quantity on the left? The
volume of a cylinder of height equal to
the radius and with area of the base
equal to $\sigma_f$ is $r_{cr}\sigma_f$. Recall that if a
neutron is in motion along the axis of
such a cylinder, then it causes fission
of those nuclei of uranium-235 whose
centers lie inside the cylinder. Since
$N$ is the number of nuclei in unit volume,
we conclude that $N r_{cr}\sigma_f$ is the
mean number of nuclei in the volume of
the cylinder.

We can now give a different state- 
ment of the criticality condition. Ear- 
lier, we learned that the mean path, 
inside a fissionable material, of a neu- 
tron born inside the material (via fis- 
sion) is of the order of the radius r. 
After a neutron has traversed a distance 
of about r, it leaves the fissionable 
material and is lost to the process. 
The criticality condition means that, 
on the average, prior to leaving the sys- 
tem, a neutron should produce one 
neutron over this distance. In fission,
$\nu - 1$ new neutrons are generated. Hence,
it is necessary that the neutron,
prior to escape, produce approximately
$1/(\nu - 1)$ fissions, that is, that there
be roughly $1/(\nu - 1)$ nuclei in the
cylinder of volume $r\sigma_f$.
This is the condition that leads to formula
(8.10.1).

Quite naturally, these arguments are 
not rigorous, but they are necessary 
for an understanding of the physical 
essence of the matter and cannot be re- 
placed by any kind of calculations, even 
the most precise ones performed on modern
electronic computers. Computer
calculations do not replace
but merely supplement a clear-cut grasp
of the qualitative physical aspect of
the matter. In particular, the reader
should pay special attention to the
principle expressed at the beginning
of the section: if some quantity ($v$)
enters into the derivation of a formula 
but is cancelled out in the final result, 
this means that there is a derivation of 
this formula that dispenses altogether 
with that quantity. And one should al- 
ways find that simpler derivation be- 
cause a different derivation of a formula 
is tantamount to a fresh view of the 
qy/uvsvssb hranig . 
9.1 Force, Work, and Power

The relations existing between the most 
important quantities of mechanics ad- 
mit of exact formulations only by 
means of integrals and derivatives. In 
Chapter 2 we examined the relationship
between the distance covered by, or
the position of, a body, $z$, and its velocity,
$v$, and also between the velocity
$v$ and the acceleration, $a$; precisely,
$v = dz/dt$ and $a = dv/dt$. We now go
on to examine the relationships between 
quantities such as force, work, ener- 
gy, and power. 

Let us consider the rectilinear motion 
of a body along the x axis. Suppose a 
force F acting on the body is also direct- 
ed along the x axis. The work A 
performed by this force is defined as the 
product of the force F by the distance 
traversed by the body, $l = b - a$,
where $a$ is the initial position of the
body and $b$ is the terminal position:

$$A = Fl = F(b - a).$$

Obviously, the situation is the same 
as in the case of the relationship be- 
tween velocity and distance, that is, 
the simple formula (work is equal to
the product of force by distance) is
valid only for the case where the force
is constant. Now if the force varies
during the process of the body’s translation,
the whole process has to be partitioned
into separate small intervals
(subintervals) so that over every subinterval
the force may be taken to be
constant (for this to be true each subinterval
must correspond to a small increment
of time or distance). Then for
a small segment $\Delta x_i$ of the translation
corresponding to the $i$th subinterval
(from position $x_i$ of the body to position
$x_{i+1}$) the work is

$$\Delta A_i = F_i\,\Delta x_i = F_i(x_{i+1} - x_i).$$

This means that in the general case of
a variable force $F = F(x)$, the work is
expressed not as a product but as an
integral:

$$A = \int_a^b F\,dx.$$
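The passage from the sum of the small amounts of work $F_i\,\Delta x_i$ to the integral can be watched numerically. In the sketch below the force law $F(x) = 3x^{2}$ and the end points $a = 1$, $b = 2$ are hypothetical values chosen by us only because the exact integral, $b^{3} - a^{3} = 7$, is easy to write down.

```python
def work_by_subintervals(force, a, b, n):
    """Sum F_i * dx_i over n equal subintervals of [a, b].

    The force on each subinterval is taken at its left end, as in the text.
    """
    dx = (b - a) / n
    return sum(force(a + i * dx) * dx for i in range(n))

def force(x):
    # Hypothetical force law used only for this illustration.
    return 3.0 * x**2

a, b = 1.0, 2.0
exact = b**3 - a**3  # integral of 3x^2 from a to b

for n in (10, 100, 1000):
    approx = work_by_subintervals(force, a, b, n)
    print(f"n = {n:5d}: sum = {approx:.5f}, error = {exact - approx:.5f}")
```

Each tenfold refinement of the partition reduces the error about tenfold, which is the numerical counterpart of the statement that the sum tends to the integral as the subintervals shrink.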

We assume as known the motion of a
body given by a known function $x = x(t)$.
The translation $dx$ of the body
during a small time interval $dt$ is equal
to the product of the (instantaneous)
velocity $v$ by the time $dt$:

$$dx = v\,dt = \frac{dx}{dt}\,dt.$$

Therefore, the expression for work can
be written thus:

$$A = \int_\alpha^\beta F\,\frac{dx}{dt}\,dt = \int_\alpha^\beta Fv\,dt, \tag{9.1.1}$$

where the moments $t = \alpha$ and $t = \beta$
correspond to the beginning and end of the
body’s motion.

The product $Fv$ in this formula is
the work performed in unit time and is
called the power. Indeed, in the case of
constant velocity and force, the distance
is equal to $x = vt$, the work is
$A = Fx = Fvt$, and the ratio of work
to the time elapsed (that is, the work
performed in unit time, or power) is
$A/t = Fv$. Denoting power $A/t$ by $W$,
we can write

$$A = \int_\alpha^\beta W\,dt. \tag{9.1.1a}$$

Recall that in the SI system of units
velocity is measured in m/s
and acceleration is measured
in m/s$^{2}$. The unit of force has a special
name, the newton (denoted N), which is
the force that imparts an acceleration
of 1 m/s$^{2}$ to a mass of 1 kg. Clearly, the
unit of energy or work is 1 N·m =
1 kg·m$^{2}$/s$^{2}$; it is called the joule
(denoted J). Finally, the unit of power
is 1 N·m/s = 1 kg·m$^{2}$/s$^{3}$ and is known
as the watt (denoted W).
Figure 9.1.1.

A body can be acted upon by several
forces simultaneously, say, $F_1$ and
$F_2$. Then we can speak of the work performed
by the first force, $A_1$, and that
performed by the second force, $A_2$,
during the time that the body was translated
from the initial position $a$ to the
terminal position $b$. Regarding forces
$F_1$ and $F_2$ as constant, we obtain

$$A_1 = (b - a)F_1, \qquad A_2 = (b - a)F_2.$$

Note the signs of the quantities in 
these expressions. A force is taken to be 
positive when it acts in the direction of 
increasing x (Figure 9.1.1), while a 
force acting in the opposite direction (to 
the left) is regarded as negative. If the 
body is translated in the direction of 
the acting force, the work of that force 
is positive. But if the body is translated 
in the direction opposite that of the 
force, so that $F_1$ and $b - a$ have different
signs, the work $A_1$ of the force is
negative. Now picture two forces acting
on a body (Figure 9.1.2a): the force
$F_1$ of a stretched spring and the force $F_2$
of the tension of a rope which you (the
reader) hold in your hand. $F_1$ acts leftwards,
$F_1 < 0$, while you are pulling
rightwards, $F_2 > 0$. If you pull with
more strength (which means that the
absolute value of the force with which
you pull rightwards is greater than the
absolute value of the force with which
the spring pulls the body leftwards, or
$|F_2| > |F_1|$), the body, which initially
was in the state of rest, will
move from left to right. Figure 9.1.2a
shows the initial position of the body
and Figure 9.1.2b the terminal position:
$(b - a) > 0$ and $F_1 < 0$. The
work $A_1$ performed upon the body by
the tension force of the spring, or more
briefly, the work of the spring, in this
translation is negative, while the work
which you have performed is positive,
$A_2 > 0$. The total work $A = A_2 + A_1$
is also positive, but $A < A_2$ since
$A_1 < 0$. This means that only part
($A$) of the work performed by you ($A_2$)
was received by the body, the other
part ($|A_1|$) having gone into stretching
the spring. Observe that in all
cases the force of friction against a 
stationary surface is directed against 
the velocity of motion of the body, and 
so the work of the friction force against 
a fixed surface is always negative,
irrespective of the direction of the mo- 
tion of the body. 

The force $F_1$ with which a spring,
one end of which is fixed, acts on a body 
differs in one very important way: 
it depends exclusively on the position 
of the body. Not all forces, by any 
means, have this property. For example, 
the force of friction between a moving 
body and a fixed surface always retards 
the motion of the body: it is directed 
leftwards if the body is in motion 
rightwards, and it is directed rightwards 
if the body is in motion leftwards. Thus, 
the direction of the force of friction de- 
pends on the direction of motion of the 
body. Besides, the force of friction can 
depend on the magnitude of the veloci- 
ty of the body. Thus, the force of fric- 
tion depends on the magnitude and 
direction of the body’s velocity and 
not only on the position of the body 
(in fact, it may even be independent of 
the position of the body). 

The force $F_2$ with which you pull
the rope in the example of Figure 9.1.2
can vary in any fashion, at your pleasure.
The body can, say, move to the
right and then to the left. In so doing
it will twice pass through the same position:
the first time in the rightward
movement at time $t_1$, the second time on the
return route at time $t_2$.

Figure 9.1.2

Figure 9.1.3

A possible graph of the motion of the
body (the dependence of the $x$ coordinate
on the time $t$) is shown for this case
in Figure 9.1.3. We can, at our will,
at time $t_1$ pull the body to the right,
$F_2(t_1) > 0$, and at time $t_2$ let go of the
rope so that $F_2(t_2) = 0$ or even push
the body leftwards so that $F_2(t_2) < 0$.
But $x(t_1) = x(t_2) = x_1$ and so, speaking
generally, an arbitrary force $F_2$
cannot be regarded as a function of the
$x$ coordinate.

The foregoing examples of the force 
of friction and the force applied by a 
person acting on his own free will serve 
to demonstrate that the dependence 
of force solely on the position of a body, 
$F_1 = F_1(x)$, which is characteristic of
the force $F_1$ with which a spring acts
on a body, is not a general property of 
all forces, but is a particular property 
associated with the elasticity of a 
spring. 

To find the work $A_i$ performed by a given force $F_i$ (where the
subscript $i$ shows that we are speaking of one of the several forces,
$F_1, F_2, \ldots$, acting on the body) one must use one of the
formulas

$$A_i = \int_a^b F_i\,dx \quad\text{or}\quad A_i = \int_\alpha^\beta F_i v\,dt.$$

We have to know two things: (a) what the motion of the body was, that
is, the dependence of the $x$ coordinate of the body on time $t$, or
$x = x(t)$, and (b) the expression for the force
$F_i = F_i(x, t, v)$, which in general depends on $x$, $t$, and $v$.

Knowing the functions $x(t)$ and $v(t)$ and substituting them into the
expression for $F_i(x, t, v)$, we arrive at an expression for $F_i$ as
a function of time and we can describe the work as an integral with
respect to time. Note that other forces may also be acting on the
body, with the result that each separate force $F_i$ cannot be
expressed in terms of the acceleration and mass of the body.

Example. Suppose we have a force $F(x) = -kx$ acting on a body and let
the motion of the body be given by the equation $x = B\sin\omega t$,
that is, $F(x, t) = -kB\sin\omega t$ and
$v = dx/dt = B\omega\cos\omega t$. We do not require that the force
$F$ be equal to the mass of the body multiplied by the acceleration:
it is assumed that other forces $Q$ act on the body, which together
with $F$ ensure that the motion of the body obeys the given law
$x = x(t)$. The work performed solely by force $F$ can easily be
found:

$$\begin{aligned}
A &= -B^2 k\omega \int_{t_1}^{t_2} \sin\omega t\,\cos\omega t\,dt
   = -\frac{B^2 k\omega}{2} \int_{t_1}^{t_2} \sin 2\omega t\,dt \\
  &= \frac{B^2 k}{4}\cos 2\omega t\,\Big|_{t_1}^{t_2}
   = \frac{B^2 k}{4}\,(\cos 2\omega t_2 - \cos 2\omega t_1)
\end{aligned} \tag{9.1.2}$$

(verify this).

In this case, where the force depends solely on the coordinate, it is
much easier and more convenient to take advantage of the expression of
work as an integral with respect to $x$:

$$A = \int_a^b F(x)\,dx = -k\int_a^b x\,dx = \frac{ka^2}{2} - \frac{kb^2}{2}.$$

Substituting $x = B\sin\omega t$, we can also easily obtain an
expression for the work over a specified interval of time from $t_1$
to $t_2$:

$$A = \frac{kB^2\sin^2\omega t_1}{2} - \frac{kB^2\sin^2\omega t_2}{2}. \tag{9.1.3}$$

It is easy to see that this expression coincides exactly with the
preceding one since

$$\cos 2\omega t = \cos^2\omega t - \sin^2\omega t = 1 - 2\sin^2\omega t,$$

whence

$$\cos 2\omega t_2 - \cos 2\omega t_1
  = 1 - 2\sin^2\omega t_2 - (1 - 2\sin^2\omega t_1)
  = 2\,(\sin^2\omega t_1 - \sin^2\omega t_2).$$

Substituting this identity into (9.1.2),
we arrive at (9.1.3).
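
The agreement between (9.1.2), (9.1.3), and the time integral of $Fv$ can also be checked numerically. The following is only a sketch: the values of `k`, `B`, `omega`, `t1`, and `t2` are arbitrary assumptions chosen for illustration, and the helper `integrate` is a plain trapezoidal rule introduced here, not something taken from the text.

```python
import math

def integrate(g, a, b, n=20000):
    """Composite trapezoidal rule for the integral of g from a to b."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

# Illustrative values (assumptions, not from the text).
k, B, omega = 2.0, 0.5, 3.0
t1, t2 = 0.2, 1.1

def power(t):
    """Integrand F * v for F = -k x and x = B sin(omega t)."""
    x = B * math.sin(omega * t)
    v = B * omega * math.cos(omega * t)
    return (-k * x) * v

work_numeric = integrate(power, t1, t2)
work_9_1_2 = k * B**2 / 4 * (math.cos(2 * omega * t2) - math.cos(2 * omega * t1))
work_9_1_3 = k * B**2 / 2 * (math.sin(omega * t1)**2 - math.sin(omega * t2)**2)

print(work_numeric, work_9_1_2, work_9_1_3)
```

The three printed numbers should agree to within the accuracy of the trapezoidal rule.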

A good deal of caution is required when using the expression for work
as an integral with respect to the $x$ coordinate of a force
$F(x, v, t)$ depending, generally, on $x$, $v$, and $t$. Indeed, in
principle, if the motion $x = x(t)$ is given, this equation can be
solved for $t$ and we can determine $t(x)$. But one must bear in mind
that $t$ may not be a single-valued function of $x$, that is, one and
the same position $x$ may correspond to two distinct instants of time
(see Figure 9.1.3). Then the overall motion has to be divided into
separate periods during which the velocity does not change sign and
$t$ is a single-valued function of $x$. But for different periods,
$t$ is expressed by different functions of $x$.

For example, let a body be moving according to the law
$x = B\sin\omega t$, as in the preceding example, but let the force be
given as a function of time, $F = f\cos\omega t$. The force is then
not a single-valued function of position $x$. Indeed, let $t = 0$;
then $x = 0$ and $F = f$. But if we put $t = \pi/\omega$, again
$x = 0$ but $F = -f$, so that the body will be in the same position
$x = 0$ at different times ($t = 0$ and $t = \pi/\omega$) though the
force will not be the same. This difficulty can be avoided when
integrating with respect to time, since to every instant of time $t$
there corresponds one definite value of the $x$ coordinate, of the
force $F$, and of all other quantities.

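It is easy to find the work by integrating with respect to time:

$$A = \int_{t_1}^{t_2} Fv\,dt
    = \int_{t_1}^{t_2} f\cos\omega t\; B\omega\cos\omega t\,dt
    = fB\omega \int_{t_1}^{t_2} \cos^2\omega t\,dt.$$

Let us take advantage of the above trigonometric formula
$\cos 2\varphi = 2\cos^2\varphi - 1$ or, which is the same,
$\cos^2\varphi = 1/2 + (\cos 2\varphi)/2$. Then

$$A = fB\omega \int_{t_1}^{t_2} \left(\frac12 + \frac12\cos 2\omega t\right) dt
    = \frac12 fB\omega\,(t_2 - t_1) + \frac14 fB\,(\sin 2\omega t_2 - \sin 2\omega t_1). \tag{9.1.4}$$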

Motion in accordance with the law $x = B\sin\omega t$ represents
oscillations of the body (see Chapter 10). As is evident from (9.1.4),
the work increases without bound with the passage of time, that is, as
$t_2$ increases. This is due to resonance that occurs between the
force and the oscillations of the body (resonance will also be
examined in detail in Chapter 10).

Consider the work performed by the force during one half-period,
choosing for the initial time $t_1 = 0$, $x_1 = 0$, and for the
terminal time $t_2 = \pi/\omega$, $\sin\omega t_2 = \sin\pi = 0$,
$x_2 = 0$. Then, in (9.1.4), $\sin 2\omega t_2 = \sin 2\omega t_1 = 0$,
and the work is

$$A = \frac{\pi}{2} fB. \tag{9.1.5}$$

The body has returned to its initial position, while the work
performed by the force is not equal to zero but has a definite
magnitude. How is this result to be understood from the viewpoint of
the first formula $A = \int_{x_1}^{x_2} F\,dx$? At first glance, if we
substitute $x_1 = x_2 = 0$, we get

$$A = \int_0^0 F\,dx = 0.$$

Actually, however, we have to consider separately the process of
buildup of $x$ from 0 to $x_{\max} = B$ and the process of decline of
$x$ from $x_{\max} = B$ to 0. During buildup, each value of $x$ is
associated with a definite value of the force $F$, which we denote by
$F_1$:

$$F_1 = f\cos\omega t = f\sqrt{1 - \sin^2\omega t} = f\sqrt{1 - (x/B)^2}.$$

During decline of $x$, the same positive values of $x$ are associated
with a negative value of the force (footnote 9.1). We denote this
negative force by $F_2$:

$$F_2(x) = -f\sqrt{1 - (x/B)^2}.$$

Thus the integral with coordinate $x$ as the variable of integration
breaks up into two,

$$A = \int_0^B F_1(x)\,dx + \int_B^0 F_2(x)\,dx. \tag{9.1.6}$$

These two integrals cannot be combined by the formula

$$\int_a^b \varphi\,dx + \int_b^c \varphi\,dx = \int_a^c \varphi\,dx,$$

since the integrands in the two integrals on the right-hand side of
(9.1.6) are expressed by different formulas, although their meaning is
the same (force). This is due to the fact that $F$ is given as a
function of $t$, while $t$ is expressed in terms of $x$ by different
formulas for the increase of $x$ from 0 to $B$ and for the decrease of
$x$ from $B$ to 0. In this case, $F_2(x) = -F_1(x)$. Substituting the
expressions $F_1(x) = f\sqrt{1 - (x/B)^2}$ and
$F_2(x) = -f\sqrt{1 - (x/B)^2}$ into (9.1.6), we obtain

$$A = f\int_0^B \sqrt{1 - (x/B)^2}\,dx - f\int_B^0 \sqrt{1 - (x/B)^2}\,dx.$$

In the second integral we can interchange the limits of integration
(this changes the sign of the integral) to get

$$A = 2f\int_0^B \sqrt{1 - (x/B)^2}\,dx. \tag{9.1.7}$$

If we put $z = x/B$, then we have $dx = B\,dz$ and

$$A = 2Bf\int_0^1 \sqrt{1 - z^2}\,dz.$$

But the integral

$$I = \int_0^1 \sqrt{1 - z^2}\,dz = \frac{\pi}{4}$$

(the area of a quadrant of a circle of radius equal to unity), and so
from (9.1.7) we get

$$A = 2Bf\,\frac{\pi}{4} = \frac{\pi}{2}\,Bf,$$

which coincides with formula (9.1.5) obtained by integrating with
respect to time.

9.1 The equation $\cos^2\omega t + \sin^2\omega t = 1$ is true for all
values of $\omega t$. From this it follows that
$\cos\omega t = \pm(1 - \sin^2\omega t)^{1/2}$, while the sign depends
on the value of $\omega t$. It is easy to see that for
$-\pi/2 < \omega t < \pi/2$ one has to take the plus sign and for
$\pi/2 < \omega t < 3\pi/2$ the minus sign, which was done above.

Thus, in the case of a force that is dependent on time and can assume
different values for the same value of $x$, the work $A$ is not a
single-valued function of $x$ either. In the foregoing case of
oscillatory motion, $F = f\cos\omega t$, $x = B\sin\omega t$, the
quantity $x$ again and again passes through the same values in the
course of time, and the work performed by the force (for positive $f$)
continues to increase all the time.
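
The half-period result (9.1.5) and the steady growth of the work can be illustrated numerically. The sketch below assumes arbitrary values of `f`, `B`, and `omega`, and uses a simple trapezoidal helper `integrate` introduced only for this illustration. It computes the work once by integrating $Fv$ over time and once by integrating over $x$ along the two branches $F_1$ and $F_2$ of (9.1.6).

```python
import math

def integrate(g, a, b, n=20000):
    """Composite trapezoidal rule; also valid when b < a."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

# Illustrative values (assumptions, not from the text).
f, B, omega = 1.5, 0.4, 2.0
half_period = math.pi / omega

def power(t):
    """F * v for F = f cos(omega t) and x = B sin(omega t)."""
    return f * math.cos(omega * t) * B * omega * math.cos(omega * t)

def F1(x):
    """Force on the rising branch (x grows from 0 to B)."""
    return f * math.sqrt(max(0.0, 1.0 - (x / B) ** 2))

def F2(x):
    """Force on the falling branch (x returns from B to 0)."""
    return -F1(x)

work_time = integrate(power, 0.0, half_period)
work_x = integrate(F1, 0.0, B) + integrate(F2, B, 0.0)
expected = math.pi * f * B / 2          # formula (9.1.5)

# Over three half-periods the body is again at x = 0, yet the work has tripled.
work_three_halves = integrate(power, 0.0, 3 * half_period)

print(work_time, work_x, expected, work_three_halves)
```

The integration over $x$ converges more slowly because the integrand has a vertical tangent at $x = B$, so only a few digits of agreement should be expected from that route.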

If the force is a function of velocity (as is the case of friction),
the situation will be similar to the one discussed above: the body can
return to its original position, but the work of the force will not be
zero. In the case of friction the work is negative (see the exercises
below).

Exercises 

9.1.1. Find an expression in the form of an integral for the work of
friction, the force of friction being proportional to the velocity of
the body and in the opposite direction, $F = -hv$, with $h$ positive.
Demonstrate that the work is negative.

9.1.2. Suppose the force of friction is constant in magnitude and
opposite in direction to the velocity, that is, $F = -h$ for $v > 0$
and $F = +h$ for $v < 0$. Suppose the body moves in accordance with
the law $x = B\sin\omega t$. Find the work of the force of friction
during the time interval from $t = 0$ to $t = \pi/\omega$.

9.1.3. The force acting on a body is given by the formula
$F = f_0\sin\omega_0 t$, with $f_0$ constant. Since the body is also
acted upon by other forces, it moves according to the law
$x = B\sin\omega_1 t$. Determine the work performed by force $F$
during the time interval from $t = 0$ to $t = T$. Consider the case
$\omega_0 = \omega_1$.

9.1.4. A body is falling according to the law $x = gt^2/2$ (the $x$
axis is directed downwards). Find the formula for the work resulting
from the air-resistance force $F = -aS\rho v^2/2$, with $a$ being a
constant of proportionality dependent on the shape of the body (see
Sections 9.14 and 9.15), $S$ the cross-sectional area of the body (in
cm$^2$), $\rho$ the air density (about $1.3 \times 10^{-3}$ g/cm$^3$),
and $v$ the rate of fall (in cm/s). Also find the formula for the work
done by the force of gravity $F = mg$, where $m$ is the mass of the
body.

Perform the computations and compare 
the results for a wooden ball of diameter 1 cm, 
a = 0.8, and for a steel bullet of length 3 cm, 
diameter 0.7 cm, a = 0.2, for t = 1 s, 10 s, 
100 s. 

Remark . The idea behind the calculation 
is that we assume the force of air resistance 
to be small compared with the force of grav- 
ity and not noticeably affecting the law of 
free fall. Computing the work done by the air 
resistance and comparing it with the work per- 
formed by gravity, we verify the correctness 
of our starting assumption concerning the 
small role of the force of air resistance. In 
Section 14 we give an exact solution of the 
problem of free fall with air resistance. 

9.1.5. A wind blowing with a speed $v_0$ acts on the sail of a boat
with a force $F$ equal to $aS\rho(v_0 - v)^2/2$ for $v < v_0$ and
$-aS\rho(v_0 - v)^2/2$ for $v > v_0$, where $v$ is the speed of the
boat, $S$ the area of the sail, $\rho$ the air density, and $a$ a
dimensionless coefficient ($a$ is approximately unity for a sail put
up perpendicular to the wind). Find the work done by the force of the
wind in moving the boat $b$ meters and the power of the wind force.
(It can be assumed that the boat is in uniform motion, that is, the
speed $v$ is constant.) Determine the work and power as functions of
$v$. Finally, find the maximum power for $v_0 = 30$ m/s, $a = 1$,
$S = 100$ m$^2$, and express the result in watts.

9.1.6. A body is moving according to the law
$x = B\cos(\omega t + \alpha)$ under several forces, including a force
that is time-dependent, $F = f\cos\omega t$. Find the work performed
by the force during the time interval from $t = t_1$ to $t = t_2$, in
particular, during one period of operation (from $t = 0$ to
$t = 2\pi/\omega$). Determine the average power of the force.

9.2 Energy 

We consider the case of a force that depends solely on the position
(coordinate) of the body, $F = F(x)$. As we saw in Section 9.1, an
example of this kind of force is the force with which a spring acts on
a body, the other end of the spring being fixed (footnote 9.2). In
that case the expression

$$A = \int_{x_1}^{x_2} F\,dx$$

may be applied without any complications (compare with Section 9.1).
In particular, in this case if the body first moves in one direction
from $x_1$ to $x_{\max} = X$ and then in the opposite direction,
returning to the initial position, we have $x_2 = x_1$, and the total
work done by the force is actually equal to zero:

$$A = \int_{x_1}^{x_2 = x_1} F(x)\,dx = 0.$$

Dividing the path length into sections only corroborates this
conclusion:

$$A = \int_{x_1}^{X} F\,dx + \int_{X}^{x_2} F\,dx
    = \int_{x_1}^{X} F\,dx - \int_{x_2}^{X} F\,dx,$$

and $A = 0$ at $x_1 = x_2$.

9.2 If the second end of the spring is allowed to move at random, the
force acting on the body will depend not only on the position of the
body but also on the position of the second end of the spring and this
does not satisfy the stated condition.

In mechanics, potential energy is defined as the capacity to do work.
A spring possesses a definite reserve of potential energy depending on
how compressed or stretched it is. If one end is fixed in position,
the potential energy of the spring depends on the position of the body
to which the free end of the spring is attached. Thus, the potential
energy $u = u(x)$ is a function of the $x$ coordinate. If in the
initial position $x = x_1$ the potential energy is $u(x_1)$, then
after the body has been displaced to position $x = x_2$, the spring
having performed work $A$ equal to

$$A = \int_{x_1}^{x_2} F(x)\,dx,$$

the remaining potential energy is equal to $u(x_1) - A$. Thus,

$$u(x_2) = u(x_1) - A = u(x_1) - \int_{x_1}^{x_2} F(x)\,dx. \tag{9.2.1}$$

Get a good feeling for the sign affixed to $A$ in this expression: if
the spring does (positive) work, the reserve capacity of the spring to
do work will diminish! The work performed by the spring is taken from
the reserve of potential energy. For this reason the work done (that
given up by the spring) is equal to the difference between the initial
and final energy of the spring:

$$A = u(x_1) - u(x_2).$$

All formulas involve the difference of potential energy in two
positions of a body. Therefore, if we replace $u(x)$ by $u(x) + C$,
where $C$ is an arbitrary constant, this will in no way affect the
physical results. Indeed,

$$[u(x_1) + C] - [u(x_2) + C] = u(x_1) - u(x_2).$$

The value of $u(x)$ at some given point, call it $x_0$, can be chosen
quite arbitrarily. Denote it by $u_0$. Then at some other point $x$
the value of the function $u(x)$ is determined from formula (9.2.1)
if in it we put $x_1 = x_0$ and $x_2 = x$:

$$u(x) = u_0 - \int_{x_0}^{x} F(x)\,dx. \tag{9.2.2}$$

That is how the problem of determining the potential energy from a
given force is solved.

We can pose the converse problem, namely, knowing the potential energy
as a function of $x$, or $u = u(x)$, find the force $F(x)$. To solve
this problem, take the derivative of both sides of (9.2.2). The
derivative of the integral (with respect to the upper limit) is equal
to the integrand, so that

$$\frac{du}{dx} = -F(x). \tag{9.2.3}$$

The minus sign here is essential. The force is positive (in the
direction of increasing $x$) if $du/dx$ is negative, that is, as $x$
increases, the potential energy $u$ decreases. The force is negative
(in the direction of decreasing $x$) if $du/dx$ is positive, that is,
as $x$ increases, the energy $u$ increases, too. In the latter case,
obviously, as $x$ decreases, the energy $u$ also decreases. This means
that the force is always in the direction of diminishing potential
energy.

Let us examine in more detail the example of the spring. Let the body
be at the origin when the spring is not under tension (Figure 9.2.1).
When the body is pulled to the right, the force of tension grows in
direct proportion to the displacement of the body and is directed
leftwards:

$$F = -kx, \quad k > 0. \tag{9.2.4}$$

Figure 9.2.1

Assume that $u_0 = 0$ at $x = 0$, that is, regard the potential energy
of the unstretched spring as zero. This yields

$$u(x) = -\int_0^x F\,dx = k\int_0^x x\,dx = \frac{kx^2}{2}.$$

It is easy to see that this $u(x)$ is associated, via formula (9.2.3),
with the force (9.2.4).
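
The two directions of this correspondence, from force to potential energy by (9.2.2) and back by (9.2.3), can be sketched numerically for the spring. The stiffness `k` and the sample point are assumed values, and the helpers `integrate`, `potential`, and `force_from_potential` are hypothetical names introduced only for this illustration.

```python
def integrate(g, a, b, n=2000):
    """Composite trapezoidal rule for the integral of g from a to b."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

k = 3.0  # spring stiffness (assumed value)

def force(x):
    """Spring force (9.2.4)."""
    return -k * x

def potential(x, x0=0.0, u0=0.0):
    """Potential energy from the force via (9.2.2)."""
    return u0 - integrate(force, x0, x)

def force_from_potential(x, h=1e-4):
    """Force recovered from the potential via (9.2.3), F = -du/dx,
    using a central difference."""
    return -(potential(x + h) - potential(x - h)) / (2 * h)

x = 0.7
print(potential(x), k * x**2 / 2)          # both give k x^2 / 2
print(force_from_potential(x), force(x))   # both give -k x
```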

We consider a second example, the force of gravity. Direct the $z$
axis upward. The force of gravity acts downward and is equal to
$-mg$, where $g$ is the acceleration of gravity. It is independent of
the height $z$, but a constant quantity is merely a special case of a
function, one whose values are all the same. The important thing is
that the force of gravity does not depend on time or velocity. We can
therefore make use of the formulas derived above. We take as zero the
potential energy of the body at the earth’s surface, where $z = 0$.
Then

$$u(z) = -\int_0^z F\,dz = -\int_0^z (-mg)\,dz = mgz. \tag{9.2.6}$$

The potential energy grows linearly with increasing height of the body
above the earth’s surface.

In the latter example we assumed that the distance $z$ is small
compared with the radius of the earth. Let us now examine the
attractive force on the assumption that the distances can be
arbitrarily large. By Newton’s law of gravitation, the attractive
force is inversely proportional to the square of the distance between
the bodies. We know that for a body above the earth’s surface the
force of gravitation towards the entire globe is equal to the force of
attraction to a mass equal to the earth’s mass and concentrated at the
earth’s center (footnote 9.3). It is therefore convenient to reckon
distances from the center of the earth and denote them by $r$. Then
the force of attraction acting on a body is directed towards the
earth’s center and is $C/r^2$ in magnitude, where the constant $C$ is
positive.

9.3 This does not hold true for a body inside the earth, in which case
we must allow only for the portion of the earth’s mass between the
earth’s center and the body.

The constant $C$ can readily be determined from the condition that the
force acting at the surface of the earth
($r = r_0 \approx 6400$ km $= 6.4 \times 10^6$ m) is known:
$|F(r_0)| = mg = C/r_0^2$, that is, $C = mgr_0^2$, where $g$ is the
acceleration of gravity at the earth’s surface (equal approximately to
9.81 m/s$^2$). We finally have

$$F(r) = -\frac{mgr_0^2}{r^2}. \tag{9.2.7}$$

For zero we again take the potential energy of the body at the earth’s
surface. Allowing for the fact that as the distance $r$ from the
earth’s center increases the work performed by force $F$ in the
process is negative, we get

$$\begin{aligned}
u &= -\int_{r_0}^{r} F\,dr = mgr_0^2 \int_{r_0}^{r} \frac{dr}{r^2}
   = mgr_0^2 \left(-\frac{1}{r}\right)\Big|_{r_0}^{r} \\
  &= mgr_0^2 \left(-\frac{1}{r} + \frac{1}{r_0}\right)
   = mg\,\frac{r_0}{r}\,(r - r_0).
\end{aligned} \tag{9.2.8}$$

At a small height $z = r - r_0 \ll r_0$, the ratio $r_0/r$ differs
but slightly from unity, whence

$$u(r) \approx mg\,(r - r_0) = mgz, \tag{9.2.9}$$

which coincides with formula (9.2.6) obtained earlier. But, as can be
seen from (9.2.8), the potential energy does not increase without
limit as $r$ increases, as would have been the case in accordance with
the approximate formula (9.2.6), but tends to a definite limit

$$u(\infty) = mgr_0. \tag{9.2.10}$$

Thus, making allowance for the decrease of gravity with distance, we
can say that the energy of a body at an infinite distance from the
earth is the same as, by the approximate formula, at a distance of
$r_0$ from the earth’s surface, or at a distance of $2r_0$ from the
earth’s center.

In this problem we encountered a physical situation involving infinite
distance. In this respect we must note that in any physical problem we
are always interested in finite quantities, or in our case finite
distances. For instance, if we consider the motion of a body and the
energy of the body as dependent on the earth’s gravity, then we can be
interested in reaching the moon, Mars, or other planets, or even
stars. All these objects involve distances that are very great
compared with the earth’s radius, but they are finite! And, obviously,
for a physicist the expression $r = \infty$ means only that
$r \gg r_0$.
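
A short numerical comparison shows how the exact formula (9.2.8) relates to the approximation (9.2.9) and to the limit (9.2.10). This is a sketch under the text's values $g \approx 9.81$ m/s$^2$ and $r_0 \approx 6.4 \times 10^6$ m; the sample heights and the function names `u_exact` and `u_approx` are assumptions made for illustration.

```python
g = 9.81      # acceleration of gravity at the surface, m/s^2
r0 = 6.4e6    # radius of the earth, m

def u_exact(r, m=1.0):
    """Potential energy (9.2.8), zero at the earth's surface."""
    return m * g * r0 * (r - r0) / r

def u_approx(r, m=1.0):
    """Near-surface approximation (9.2.9), u = m g z with z = r - r0."""
    return m * g * (r - r0)

# Heights above the surface, in metres (illustrative choices).
for z in (1e3, 1e5, r0, 100 * r0, 1e6 * r0):
    r = r0 + z
    print(f"z = {z:10.3e} m  exact = {u_exact(r):.4e}  approx = {u_approx(r):.4e}")

u_limit = g * r0   # the limit (9.2.10) per unit mass
print("u(infinity) per unit mass:", u_limit)
```

For small `z` the two columns nearly coincide, while for large `z` the exact value levels off near `u_limit` and the approximate one keeps growing.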

Suppose we consider the problem of launching a rocket to a great
height, to a considerable distance from the earth. We are interested
in the energy required and the time of flight. Here are two cases.

(a) A space vehicle is to traverse a distance of $R = 10r_0$, where
$r_0$ is the earth’s radius.

(b) A space vehicle is to traverse $R = 100r_0$.

The work needed to tear away from the earth and go a distance $R$ from
the earth’s center is

$$A = mgr_0^2 \left(\frac{1}{r_0} - \frac{1}{R}\right). \tag{9.2.11}$$

Recalling that $r_0 \approx 6.4 \times 10^6$ m, we get
$A_1 \approx mg \times 5.76 \times 10^6$ in the first case and
$A_2 \approx mg \times 6.34 \times 10^6$ in the second.

A ten-fold change in distance caused a relatively small change in the
energy required. If we were to replace $R$ by infinity, we would get
$A_\infty \approx mg \times 6.4 \times 10^6$ (note that $\infty$ is
not a number). $A_1$ differs from $A_\infty$ by 10%, and $A_2$ differs
by 1%. That is why, when computing work, $R$ may be replaced by
infinity. But a change in $R$ has a strong effect on the time of
flight, and for this reason when considering the time of flight we
must never replace $R$ by infinity.
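
The figures quoted for the two cases follow directly from (9.2.11). The sketch below works with the ratio $A/(mg)$, so that no particular mass needs to be assumed; the function name `work_per_mg` is a hypothetical helper introduced for this illustration.

```python
r0 = 6.4e6  # radius of the earth, m

def work_per_mg(R):
    """A / (m g) from (9.2.11): r0^2 * (1/r0 - 1/R), in metres."""
    return r0**2 * (1.0 / r0 - 1.0 / R)

A1 = work_per_mg(10 * r0)     # case (a)
A2 = work_per_mg(100 * r0)    # case (b)
A_inf = r0                    # limit of (9.2.11) as R grows without bound

rel1 = (A_inf - A1) / A_inf   # relative shortfall of A1
rel2 = (A_inf - A2) / A_inf   # relative shortfall of A2

print(A1, A2, A_inf)          # about 5.76e6, 6.34e6, 6.4e6
print(rel1, rel2)             # about 0.10 and 0.01
```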

To summarise, then, one and the 
same quantity R in one and the same 


problem can either be replaced by infi- 
nity or not, depending on the aspect 
considered. The possibility of such a 
substitution depends not only on the 
quantity R itself (and its comparison 
with other quantities of the same di- 
mensions entering into the formulas, r 0 
in the given case) but also on the struc- 
ture of the formula in which it occurs. 

Returning to the question of the potential energy of a body attracted
to the earth, let us find the numerical value of $u(\infty)$ per unit
mass. It is equal to
$gr_0 \approx 9.81 \times 6.4 \times 10^6 \approx 6.28 \times 10^7$
J/kg $= 6.28 \times 10^4$ J/g, which is about 30 times the heat of
evaporation of water and 10 times the chemical energy of explosives.

In problems of celestial mechanics and in physics it is advisable to
choose for zero the potential energy of a body located at an infinite
distance from the mass attracting it. Then for the potential energy of
a body at a distance $r$ we have

$$u(r) = u(\infty) - \int_\infty^r F(r)\,dr = -\frac{C}{r}, \tag{9.2.12}$$

where $C$ is the constant in the expression for the force,
$F = -C/r^2$, and can be determined from the formula $C = mgr_0^2$ if
we know the acceleration of gravity $g$ at the earth’s surface and the
radius of the earth, $r_0$.

We can obtain a different expression for $C$. By Newton’s law of
gravitation, $F = -GmM/r^2$, where $m$ is the mass of a body attracted
to the earth, $M$ is the mass of the earth, $r$ is the distance from
the body to the center of the earth, and $G$ the gravitational
constant equal to approximately
$6.7 \times 10^{-11}$ N·m$^2$/kg$^2$ $= 6.7 \times 10^{-11}$
m$^3$/(kg·s$^2$). Therefore $C = GmM$. Using this formula, we can
easily determine $C$ if we know $G$ and $M$.

Indeed, by measuring the attraction of two heavy balls of known masses
we can find $G$. Only after this, by measuring the attraction of a
body to the earth, can we find the earth’s mass. The form of the
function expressing Newton’s law of gravitation, that is, the fact
that the force is inversely proportional to the square of the
distance, is proved by comparing the attraction to the earth of a body
at the earth’s surface with the attraction of a remote body, the moon,
as well as by comparing the attraction to the sun of planets that
orbit the sun at different distances, from Mercury to Pluto.

The problem of the potential energy of two electric charges $e_1$ and
$e_2$ is quite similar to the preceding one. The interaction force
between the charges is equal to

$$F = k\,\frac{e_1 e_2}{r^2}. \tag{9.2.13}$$

Here, if the charges are expressed in electrostatic units (the unit of
charge is $(1/3) \times 10^{-9}$ C (coulomb)) and the force in dynes
(1 dyne $= 10^{-5}$ N), $k$ is equal
to unity; in SI, where charge is mea- 
sured in coulombs and force in newtons, 
k = (1/9) X 10" 13 . There is no minus 
sign in (9.2.13) of the kind we see in the expression for the
gravitational force. Indeed, if $e_1$ and $e_2$ are like charges (both
positive or both negative), the product $e_1 e_2$ is positive. But
like charges repel one another, that is, the force $F$ is positive.

Again defining $u(r)$ so that $u(\infty) = 0$, we obtain

$$u(r) = k\,\frac{e_1 e_2}{r}. \tag{9.2.14}$$

The potential energy of two like charges separated by a finite
distance is positive: they repel and, moving from $r$ to $\infty$,
can perform work equal to

$$u(r) - u(\infty) = u(r).$$

The potential energy of two unlike charges is negative. Indeed,
$e_1 e_2 < 0$ if, say, $e_1 > 0$ and $e_2 < 0$. This is clear
physically: since unlike charges attract each other, energy must be
expended to pull them apart to infinity.

Note that thanks to the law of conser- 
vation of energy, the potential energy 
may be defined not only as the capaci- 


ty to do work but also as the work re- 
quired to bring a system to a given state. 
A stretched spring can do a definite 
amount of work in returning to the un- 
stretched state. That, clearly, was the 
work that had to be done to stretch the 
spring in the first place. Similar as- 
sertions may be made in the case of a 
body raised to a definite height above 
the earth or for a system of two 
charges. 


9.3 Equilibrium and Stability 

We consider a body that can move with- 
out friction along a straight line, which 
we take for the x axis. Let the 
body be acted upon by a force directed 
along this axis and dependent on the x 
coordinate. We can again picture the 
spring. Below we will examine other 
examples as well. 

The equilibrium position of a body is defined as that position for
which the force is zero and the body is at rest. Denote the point of
equilibrium by $x_0$. Then $F(x_0) = 0$. Expanding the function
$F = F(x)$ in a Taylor series and ignoring all powers of $x - x_0$
except the first, we see that two versions of the function $F(x)$ are
possible in the neighbourhood of point $x_0$ (provided $F(x_0) = 0$),
namely, $F(x) \approx k_1(x - x_0)$ and $F(x) \approx -k_2(x - x_0)$;
in both formulas it is assumed that $k_1$ and $k_2$ are positive
quantities. The first case is shown in Figure 9.3.1a, the second in
Figure 9.3.1b.

These two cases are associated with an entirely different character of
equilibrium. In the case of Figure 9.3.1a, if the body is somewhat to
the right of point $x = x_0$, then it is acted upon by a positive
force, that is, a force which pulls it farther rightwards. Thus, the
equilibrium at point $x = x_0$ in Figure 9.3.1a is unstable. A slight
deviation of the body (whether to the right or left makes no
difference) suffices for a force to begin to act on the body, and this
force will increase the deviation. On the other hand, in the case of
Figure 9.3.1b the force is negative (pulls leftwards) when the body
deviates to the right. Deviation of the body from the equilibrium
position gives rise to a force that tends to return the body to the
position of equilibrium. Here we are dealing with stable equilibrium.
It is easy to see that for a body attached to a spring the second case
is realized.

In accordance with the above expressions for force, we find the
expressions of potential energy via (9.2.2). In the case of unstable
equilibrium,

$$u(x) \approx u(x_0) - \frac12 k_1 (x - x_0)^2.$$

In the case of stable equilibrium,

$$u(x) \approx u(x_0) + \frac12 k_2 (x - x_0)^2.$$

The appropriate curves are shown in Figure 9.3.2a and b.

Thus, in the case of unstable equilibrium, the potential energy has a
maximum, while in the case of stable equilibrium it has a minimum. In
both cases the force is zero at the point of maximum or minimum, that
is,

$$F = -\left(\frac{du}{dx}\right)_{x = x_0} = 0.$$

This result is quite natural. If a body 
is in the state of maximum potential 
energy, energy is released during dis- 
placements in both directions. This ener- 
gy can be used to overcome inertia and 



is converted into kinetic energy. But 
if the body is in the state of minimum 
energy, energy from an outside source 
is required to move it to any other po- 
sition. This energy will go to increase 
the potential energy. A small expendi- 
ture of energy will displace the body 
only a small distance. These properties 
of a body in a position of minimum po- 
tential energy fully accord with the 
concept of stable equilibrium. 

Take the case of gravity near the 
earth’s surface. The potential energy is 
mgz , where z is the height above the 
surface. The curves depicting the func- 
tion u (x) can be visualized as curves 
indicating the altitude z of a body as 
a function of the horizontal x coordi- 
nate. We have to imagine a body in mo- 
tion along a curve like a bead on a stiff 
wire. The u versus x curve corresponds 
to the shape of the wire if the plane of 
the drawing is vertical. Then it is clear 
that the maximum of $u(x)$ corresponds (see Figure 9.3.2a) to a point
on the wire from which the bead slides downward at the slightest
touch, moving to the right or to the left away from the maximum, and
the minimum of $u(x)$ (Figure 9.3.2b) corresponds to the lowest point,
at which the bead is in a stable position, and any other beads on the
wire would strive to take up that position.

Thus, the graph of u (x) gives a pic- 
torial visualization of the direction of 
forces and character of equilibrium. 

Let us examine a few examples. 

Example 1. Let a charged body be in motion along a straight line
(which we take for the $x$ axis) on which two identical charges $e_1$
are fixed symmetrically about the origin at a separation of $2a$
(Figure 9.3.3).

It is quite clear that at the origin the body is in a state of
equilibrium. Indeed, the forces acting on the body from the fixed
charges are equal in magnitude and opposite in direction so that they
balance, which means their resultant is zero.

Figure 9.3.2

Figure 9.3.3

Figure 9.3.4

The potential energy $u(x)$ of a body with a charge $e$ is

$$u(x) = \frac{e_1 e}{r'} + \frac{e_1 e}{r''},$$

where $r'$ is the distance to the left-hand charge, and $r''$ the
distance to the right-hand charge, that is, $r' = x + a$ and
$r'' = a - x$. Whence

$$u(x) = e_1 e \left(\frac{1}{a + x} + \frac{1}{a - x}\right). \tag{9.3.1}$$

The appropriate curves are shown in Figure 9.3.4. The upper curve
corresponds to $e_1 e > 0$, which is the case of like charges of body
and fixed charges, while the lower curve corresponds to $e_1 e < 0$,
which means that the body has a charge opposite to the fixed charges.

In the case $e_1 e < 0$, equilibrium at the origin is unstable.
Indeed, the body is attracted both by the left and the right charge
and at the origin the forces of attraction balance. But if the body is
displaced the slightest bit in any direction, say to the right, the
attraction on the right will exert a stronger effect and will continue
to pull it rightward. Similar reasoning shows that in the case
$e_1 e > 0$, the equilibrium is stable. Here, if the body is displaced
rightward, the (greater) repulsion on the right will tend to move it
back into the initial position.

Figure 9.3.5

Let us find $(d^2u/dx^2)_{x=0}$. Using (9.3.1), we get

$$\frac{d^2u}{dx^2} = 2e_1 e \left[\frac{1}{(a + x)^3} + \frac{1}{(a - x)^3}\right]. \tag{9.3.2}$$

Putting $x = 0$ in (9.3.2), we find that

$$\left.\frac{d^2u}{dx^2}\right|_{x=0} = \frac{4e_1 e}{a^3}.$$

Hence, if $e_1 e$ is positive, then $d^2u/dx^2 > 0$ at $x = 0$, the
function $u(x)$ has a minimum at $x = 0$, and the equilibrium is
stable. But if $e_1 e$ is negative, $d^2u/dx^2 < 0$ at $x = 0$, the
function $u(x)$ has a maximum at $x = 0$, and the
equilibrium is unstable.
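
The sign test on the second derivative can be reproduced numerically for Example 1. The following sketch assumes units in which $k = 1$ (as in the formulas of this example), takes $a = 1$, and writes `q` for the product $e_1 e$; the names `u`, `second_derivative`, and `classify` are hypothetical helpers introduced for illustration.

```python
def u(x, a=1.0, q=1.0):
    """Potential energy (9.3.1) with q = e1 * e, valid for |x| < a."""
    return q * (1.0 / (a + x) + 1.0 / (a - x))

def second_derivative(fn, x, h=1e-4):
    """Central-difference estimate of the second derivative of fn at x."""
    return (fn(x + h) - 2.0 * fn(x) + fn(x - h)) / h**2

def classify(q, a=1.0):
    """Return 'stable' for a minimum of u at x = 0, 'unstable' for a maximum."""
    curvature = second_derivative(lambda x: u(x, a, q), 0.0)
    return "stable" if curvature > 0 else "unstable"

d2u_like = second_derivative(lambda x: u(x, 1.0, 1.0), 0.0)
print(d2u_like)          # close to 4 * q / a**3 = 4
print(classify(+1.0))    # like charges
print(classify(-1.0))    # unlike charges
```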

Example 2. We consider a situation in which the charges are spaced in
the same way from the origin, but along a straight line perpendicular
to the line (the $x$ axis) along which the charged body is in motion
(Figure 9.3.5). In this case the potential energy is

$$u(x) = \frac{2e_1 e}{\sqrt{a^2 + x^2}}.$$

The graph of the function $u(x)$ at $a = 1$ and $|e_1 e| = 1$ is shown
in Figure 9.3.6. Here equilibrium at the origin is unstable for
$e_1 e > 0$. If the charge $e$ of the body is opposite to the fixed
charges $e_1$ ($e_1 e < 0$), the equilibrium is stable.

This is easy to establish if we examine the force acting on a moving
charge (Figure 9.3.7). Let $e_1 e$ be positive. Displace the body
rightward from the position of equilibrium. The resultant force of
repulsion is also directed to the right, further increasing the
deviation. The equilibrium is unstable. If $e_1 e$ is negative, the
resultant force is in the direction of decreasing deviation and the
equilibrium is stable.

These results are also readily arrived at by considering $d^2u/dx^2$
at x = 0 (do this yourself).

Note that for $e_1 e > 0$, when stability occurred in Example 1
(Figure 9.3.3), in Example 2 (Figure 9.3.5) we had unstable
equilibrium. For $e_1 e < 0$ (unlike charges) the situation was
reversed: the equilibrium is unstable for the arrangement of charges
given in Figure 9.3.3 and is stable for the arrangement shown in
Figure 9.3.5.

Turning Figure 9.3.5 through 90°, we note that actually it refers to
the same initial distribution of charges in the equilibrium position
as in Figure 9.3.3. We can say that Figures 9.3.3 and 9.3.5 refer to
the same initial distribution of charges, but the directions of motion
under consideration differ. Then the equilibrium will always (for any
signs of the charges) be unstable in one direction or in the other.


In electrostatics it is proved that 
this result is general: there is no point 
of equilibrium in the space between 
external fixed charges such that equi- 
librium is stable relative to displace- 
ments in any direction. 

The general proof of this fact given 
below may appear too hard for the 
reader and he can skip it without any 
loss of continuity in the book. 

For proof in the general form, note that the potential energy of a
charge e at point (x, y, z) as a function of the charge's distance r
from a fixed charge $e_1$ located at point $(x_1, y_1, z_1)$ is

$$u = \frac{e_1 e}{r}, \qquad r = \sqrt{(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2}.$$

Consider the motion along the x axis and find $d^2u/dx^2$ for y and z
constant (this quantity is known as the second partial derivative of u
with respect to x and is denoted by $\partial^2 u/\partial x^2$). Then
in a similar manner we find $\partial^2 u/\partial y^2$ and
$\partial^2 u/\partial z^2$, which refer to motion along the y and z
axes, respectively. It turns out (see footnote 9.4) that for arbitrary
x, y, z, $x_1$, $y_1$, and $z_1$ the sum of the second derivatives
along the three perpendicular axes is zero (see footnote 9.5):

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0. \tag{9.3.3}$$


Obviously, this property will be preserved for the sum of any number
of terms of the form $e_k e / r_k$, where $e_k$ is the fixed charge at
point $(x_k, y_k, z_k)$, and $r_k$ is the distance of charge e from
this point. Consequently, formula (9.3.3) is valid for any distribution
of fixed charges in space.
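
Property (9.3.3) for a sum of terms $e_k e / r_k$ can be illustrated with a rough numerical experiment. This is a sketch only: the charge values, their positions, the test point, and the finite-difference step are all hypothetical choices, not data from the text.

```python
import math


def potential(p, charges, e=1.0):
    """Sum of e_k * e / r_k over fixed charges given as (e_k, (x_k, y_k, z_k))."""
    return sum(e * ek / math.dist(p, pk) for ek, pk in charges)


def laplacian(f, p, h=1e-3):
    """Sum of central-difference second derivatives of f along x, y, and z."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (f(plus) - 2.0 * f(p) + f(minus)) / h**2
    return total


# Hypothetical arrangement of three fixed charges and a test point away from them.
charges = [(1.0, (1.0, 0.0, 0.0)), (1.0, (-1.0, 0.0, 0.0)), (-2.0, (0.3, 2.0, -1.0))]
point = (0.2, -0.4, 0.5)

lap = laplacian(lambda p: potential(p, charges), point)
print(f"sum of second derivatives ~ {lap:.2e}")
```

The printed sum should be close to zero, up to the error of the difference formula, even though the three individual second derivatives are not small.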



9.4 The reader should convince himself of this. Note that the point
$(x_1, y_1, z_1)$ at which the fixed charge is positioned is special
in that the function and its derivatives become infinite; we will
ignore this point in our discussion. Various ways in which such points
can be considered are discussed in Chapter 16.

9.5 Functions that satisfy this condition are known as harmonic
functions; they are widely used in many fields of physics (see
Chapters 15 and 17).


Figure 9.3.7

In particular, this formula is valid at the point at which charge e is
in equilibrium. But for equilibrium it is necessary that the sum of
the projections of the forces on each of the axes at this point be
equal to zero. For this we must have

$$\frac{\partial u}{\partial x} = 0, \quad \frac{\partial u}{\partial y} = 0, \quad \frac{\partial u}{\partial z} = 0,$$

since if the projections of a force on three perpendicular axes are
zero, so is the force (a vector quantity), that is, the projection of
the force on any direction is zero (see footnote 9.6).

For equilibrium to be stable relative to motion along all three
perpendicular axes, it must be true that

$$\frac{\partial^2 u}{\partial x^2} > 0, \quad \frac{\partial^2 u}{\partial y^2} > 0, \quad \frac{\partial^2 u}{\partial z^2} > 0. \tag{9.3.4}$$

But this contradicts (9.3.3) since the sum of three positive
quantities cannot be zero.


Exercises 

9.3.1. A charge e moves along a straight line on which are fixed two
positive charges $e_1$ and $e_2 = 4e_1$ at a separation of 2a. Find
the point on the straight line at which equilibrium of the charge e is
possible and determine the type of equilibrium. Consider two cases,
e > 0 and e < 0.

9.3.2. Solve exercise 9.3.1 when the sign of $e_2$ is opposite to that
of $e_1$.


9.4 Newton’s Second Law 

Newton's second law states that the product of the mass by the
acceleration is equal to the force applied (see footnote 9.7).
Acceleration a is the derivative of velocity v with respect to time;
in turn, velocity is the derivative of the coordinate of the body with
respect to time. Thus,

$$ma = m \frac{dv}{dt} = F, \tag{9.4.1}$$

or

$$m \frac{d^2 x}{dt^2} = F. \tag{9.4.2}$$

9.6 If we have a nonzero force F (a vector) acting in some direction,
then there will be a force acting along each axis equal to the
projection of the force F on the axis.

9.7 Newton's first law, the law of inertia, states that any body on
which no forces act moves uniformly in a straight line. This means
that the acceleration is equal to zero for a force equal to zero.
Thus, the first law is contained in the second as a particular case.


We begin with the case where the force is given as a function of time,
F = F(t). This means that the derivative $d^2x/dt^2$ is given as a
function of time. Using Newton's law (9.4.1), we can easily find the
velocity at any given instant of time. Besides the applied force we
also have to specify the velocity at some time $t_0$. Then

$$v(t) = v(t_0) + \frac{1}{m} \int_{t_0}^{t} F(t)\, dt. \tag{9.4.3}$$

Knowing the velocity as a function of time, v = v(t), and the initial
position $x(t_0)$ of a body, we can find the position of the body at
any given moment in time:

$$x(t) = x(t_0) + \int_{t_0}^{t} v(t)\, dt, \tag{9.4.4}$$

where v(t) is given by (9.4.3).

The relationship between velocity 
and distance is considered in detail 
together with examples in Chapter 2. 

On the whole, formulas (9.4.3) and (9.4.4) solve the problem of
finding x(t) from Eq. (9.4.2), which is a second-order ordinary
differential equation, involving the second derivative of the unknown
function x(t). The answer includes not only the given function F(t)
but also two constants defined from the initial conditions, the
position and the velocity of the body at a given time $t_0$.
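
The two-step recipe of (9.4.3) and (9.4.4) can be sketched numerically. The routine below is an illustration under stated assumptions: the name `integrate_motion`, the trapezoidal rule, the step count, and the sample values of mass, force, and time are all hypothetical. It integrates the force once to obtain the velocity and a second time to obtain the position.

```python
def integrate_motion(force, m, t0, t1, x0=0.0, v0=0.0, steps=10_000):
    """Apply (9.4.3) and then (9.4.4) with the trapezoidal rule.

    Returns the position and velocity at time t1, given x(t0) = x0 and
    v(t0) = v0 (the two constants fixed by the initial conditions).
    """
    dt = (t1 - t0) / steps
    x, v, t = x0, v0, t0
    for _ in range(steps):
        v_next = v + 0.5 * (force(t) + force(t + dt)) * dt / m  # (9.4.3)
        x += 0.5 * (v + v_next) * dt                            # (9.4.4)
        v, t = v_next, t + dt
    return x, v


# Hypothetical check against a constant force, where the result is known
# in closed form: v = (F/m) t and x = (F/m) t^2 / 2.
x_end, v_end = integrate_motion(lambda t: 3.0, m=2.0, t0=0.0, t1=4.0)
print(f"x = {x_end:.6f}, v = {v_end:.6f}")
```

Any other force law F(t) can be passed in place of the constant one; only the two initial values need to be supplied alongside it.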

If the law of motion of the body is given or has been experimentally
established, that is, we know the function x(t), it is easy to find
the force applied to the body: to do this we must find the second
derivative of the function x(t) and multiply it by m (formula
(9.4.2)).

Exercises

9.4.1. Find the law of motion of a body acted upon by a constant force
F if at time t = 0 the body is at rest at the origin x = 0.

9.4.2. The same provided that (a) x = 0 and $v = v_0$ at t = 0, and
(b) $x = x_0$ and $v = v_0$ at t = 0.

9.4.3. A body of mass 20 kg begins to move under a force of 1 N from
the origin without an initial velocity. What distance will it cover in
ten seconds?

9.4.4. A ball falls from a height of 100 meters (initial velocity
zero). How long will it take the ball to reach the ground? (Disregard
air resistance.)

9.4.5. Under the conditions of the preceding problem, the ball starts
falling at a velocity $v_0$ = 10 m/s. Examine two cases: (a) the
initial velocity of the ball is directed downward and (b) the initial
velocity is directed upward. Determine the time required to reach the
ground and the velocity the ball will have at the time of impact.
Verify that in cases (a) and (b) the velocity of impact is the same.

9.4.6. A body is acted upon by a force proportional to the time that
elapses from the beginning of motion (the constant of proportionality
is equal to k). Find the law of motion of the body if it is known that
the body begins moving from point x = 0 at an initial velocity $v_0$.

9.4.7. A body is acted upon by a force periodically varying in time,
$F = f \cos \omega t$, where f and ω are constants.

(a) Find the law of motion of the body provided that x = 0 and v = 0
at t = 0. Establish that this is oscillatory motion. Determine the
period of oscillation, the maximum value of x(t), and the greatest
value of the velocity.

(b) The same for a force $F = f \sin \omega t$ and x = 0 and v = 0 at
t = 0.

9.4.8. A body is in motion under a constant force F. At time
$t = t_0$ the body is at point $x = x_0$. Find the velocity the body
must have at $t = t_0$ so that at $t = t_1$ the body will reach point
$x = x_1$.

9.5 Impulse 

The problem of finding the law of mo- 
tion of a body for a given dependence of 
the force on the time was, in principle, 
solved in the preceding section. Here 
we will examine the properties of the 
solution and certain new concepts as- 
sociated with the solution. 

The product P = mv of mass by velocity is called the quantity of
motion, or momentum, while the quantity

$$I(t_0, t) = \int_{t_0}^{t} F(t)\, dt \tag{9.5.1}$$

is known as the impulse of the force during the time interval between
$t_0$ and t. Formula (9.4.3) may be written as follows:

$$P(t) - P(t_0) = \int_{t_0}^{t} F\, dt \quad (= I(t_0, t)). \tag{9.5.2}$$

Figure 9.5.1

In other words, impulse equals change 
in momentum. 

There are forces that act during very 
brief time intervals (an instance is the 
blow of a hammer and the rebound af- 
ter striking a body). Both prior and fol- 
lowing the blow, the force is equal to 
zero. It is clear that in the absence of 
other forces (other than the brief blow) 
the body prior to the blow moves with 
a constant velocity and after the blow 
with another, also constant, velocity. 

Let F(t) differ from zero only during the interval between $t_1$ and
$t_2$ (Figure 9.5.1). We consider the integral

$$I = \int_{t_1}^{t_2} F(t)\, dt. \tag{9.5.3}$$

It may be called the total impulse in 
the sense that the integral is taken over 
the entire interval of time during which 
the force acts. 

The expression (9.5.1) involves an integral from $t_0$ to t. If
$t_0 < t_1$ and $t > t_2$, then

$$I(t_0, t) = \int_{t_0}^{t} F\, dt = I.$$

Indeed, we write

$$\int_{t_0}^{t} F\, dt = \int_{t_0}^{t_1} F\, dt + \int_{t_1}^{t_2} F\, dt + \int_{t_2}^{t} F\, dt.$$

The first and third integrals on the right-hand side are equal to
zero, since F = 0 over the respective intervals, and the second
(middle) integral is I. Thus, from (9.5.2) and (9.5.3) we get
$P(t) = P(t_0) + I$ if $t_0 < t_1$ and $t > t_2$.

Figure 9.5.2

From formula (9.4.3) we see that the velocity following the blow
depends solely on the impulse of the force, that is, on the integral
of the force, and not on the particular type of the function F = F(t).
For example, several different curves of F(t) shown in Figure 9.5.2a
all yield the same impulse, which is to say, they all change the
velocity of a body by the same amount. It is not difficult to draw the
appropriate graph of velocity, v = v(t), for each of these curves
representing F(t). Figure 9.5.2b depicts these graphs under the
general assumption that the initial velocity is equal to zero. The
common element of all the curves in Figure 9.5.2b is the final value
of velocity: all the curves go into a horizontal straight line on the
right at a height v = I/m.
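
A small numerical sketch may make this concrete. The two pulse shapes below (a rectangle and a triangle of equal area), the mass, and the midpoint-rule integrator are hypothetical illustrations, not the curves of Figure 9.5.2. Both pulses are built to have the same area, so both should give the same final velocity I/m.

```python
def impulse(force, t0, t1, steps=100_000):
    """Midpoint-rule estimate of the integral of force(t) from t0 to t1."""
    dt = (t1 - t0) / steps
    return sum(force(t0 + (k + 0.5) * dt) for k in range(steps)) * dt


def rectangular_pulse(t):
    """Constant force 2 acting between t = 1 and t = 2 (area 2)."""
    return 2.0 if 1.0 <= t <= 2.0 else 0.0


def triangular_pulse(t):
    """Triangle of height 4 centered at t = 1.5 with base 1 (area 2)."""
    return max(0.0, 4.0 * (1.0 - 2.0 * abs(t - 1.5)))


m = 0.5  # hypothetical mass
for name, pulse in (("rectangle", rectangular_pulse), ("triangle", triangular_pulse)):
    total = impulse(pulse, 0.0, 3.0)
    print(f"{name}: I ~ {total:.4f}, final velocity I/m ~ {total / m:.4f}")
```

The shapes of the two velocity curves differ during the blow, but the velocity after it is the same in both cases.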

Each of the curves representing F(t) in Figure 9.5.2a may be
compressed along the axis of time and proportionately stretched along
the axis of force. The area under the curve of F, that is,
$\int F\, dt$, the total impulse, does not change in the process. That
is precisely how, say, curve 2 in Figure 9.5.2a was obtained from
curve 1.



Figure 9.5.3 

The shorter the time of action of a force, the shorter the time
interval during which the velocity of a body changes from the initial
value $v_0 = 0$ to the final value v = I/m (Figure 9.5.2b). Thus, in
the limit of an extremely great force acting over an extremely short
time interval, the graph of velocity takes the shape of a step
(compare Figure 9.5.3a with curve 2 in Figure 9.5.2b). It is not
essential here which of the curves in Figure 9.5.2a we compressed: the
step is characterized by only one quantity, v = I/m, and this quantity
is the same for all the curves.

If prior to the application of the force the body was at rest at point
$x_0$, after a brief application of a great force the body begins to
move with a constant velocity equal to I/m. If the force acted at time
$t_1$ (we consider the interval between $t_1$ and $t_2$ to be small
and therefore do not distinguish between $t_2$ and $t_1$), the
position of the body as a function of time is given by the formulas

$$x = x_0 \ \text{ if } t < t_1, \qquad x = x_0 + \frac{I}{m}(t - t_1) \ \text{ if } t > t_1. \tag{9.5.4}$$

The appropriate graph is shown in Figure 9.5.3b. Note that x = x(t)
satisfies the equation of motion (9.4.2).

We recall that on the graph of x(t) the first derivative dx/dt is
connected with the slope of the tangent to the curve, while the second
derivative $d^2x/dt^2$ describes the rate of change of the first
derivative, that is, the second derivative is associated with the
curvature of the curve x = x(t) (see Section 7.10).

In Figure 9.5.3b the curve x = x(t) has a salient point at
$t = t_1$, $x = x_0$. The salient point can be regarded as a point at
which the curvature is infinite, so that the existence of such a point
corresponds to consideration of a very large force (infinite in the
limit). However, both before and after the salient point the
derivative dx/dt is finite. This means that a very large force acted
over a very short time interval, so that the impulse is finite (see
footnote 9.8). The impulse can readily be found from the graph (see
Figure 9.5.3b) by computing the velocity after the application of the
force and employing formula (9.5.2).

The law that we have found concerning the motion of a body that up to
time t = τ was at rest and at that moment received an impulse I will
help us to refine the formulas (9.4.3) and (9.4.4). For this we need a
special case of the formula (9.5.4) where the body is at the origin
$x_0 = 0$ at t = τ. We introduce the notation

$$x_1(t, \tau) = \begin{cases} 0 & \text{if } t < \tau, \\ \dfrac{I}{m}(t - \tau) & \text{if } t > \tau. \end{cases} \tag{9.5.5}$$

If we substitute v(t) from (9.4.3) into (9.4.4) and use more accurate
designations (so that the upper limit and the variable of integration
have different letters), we get an expression that at first glance is
rather unwieldy:

$$x(t) = x(t_0) + (t - t_0)\, v(t_0) + \frac{1}{m} \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} F(t_2)\, dt_2. \tag{9.5.6}$$

9.8 For greater details involving the various mathematical
constructions associated with this example see Chapter 16.


The third term on the right-hand side can be transformed by the formal
rules for handling iterated (sometimes called multiple) integrals.
However, since we have not mentioned such rules anywhere, we get the
transformed expression (in the form of a single integral) by using the
law (9.5.5) of the motion of a body under the action of a single
impulse of force.

The action of force F(τ) during the time interval Δτ from τ to τ + Δτ
can be approximately replaced by the (instant) action of the impulse
ΔI = F(τ) Δτ. We already know the motion of a body under the action of
such an impulse: see formula (9.5.5), in which I is to be replaced by
ΔI(τ).

It then only remains to combine the contributions of all intervals
$\Delta\tau_i$ to coordinate x(t) to get

$$x(t) \approx \sum x_1(t, \tau) = \sum \frac{1}{m} (t - \tau)\, F(\tau)\, \Delta\tau \approx \frac{1}{m} \int_{t_0}^{t} F(\tau)\, (t - \tau)\, d\tau. \tag{9.5.7}$$

Here, as usual, we replaced the sum of a large number of terms
corresponding to small intervals Δτ by the integral (the right- and
left-hand sides of this formula can now be connected by a strict
equality, ignoring the intermediate terms). Formula (9.5.7) does not
take into account the initial coordinate $x(t_0)$ and the motion with
the initial velocity, $(t - t_0)\, v(t_0)$, since in (9.5.7) we
assumed that such motion is absent ($v(t_0) = 0$ and $x(t_0) = 0$). A
more general expression can be obtained if we add these terms to the
right-hand side of (9.5.7):

$$x(t) = x(t_0) + (t - t_0)\, v(t_0) + \frac{1}{m} \int_{t_0}^{t} F(\tau)\, (t - \tau)\, d\tau. \tag{9.5.8}$$

The advantage of formula (9.5.8) over (9.5.6) is that in (9.5.8) we
have to integrate only once. We did not state why we can merely add
the terms corresponding to individual impulses (involving the initial
velocity and the initial coordinate). This is examined in more detail
in Section 17.4. Here it suffices for us that we can directly verify
the values of $x(t_0)$, $(dx/dt)_{t=t_0}$, and $d^2x/dt^2$ using
formula (9.5.8). To this end we must differentiate x(t) in (9.5.8) in
a manner similar to what was done in the verification of formula
(8.3.5).
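
The agreement between the iterated integral (9.5.6) and the single integral (9.5.8) can also be checked numerically. The sketch below rests on assumptions: the function names, the midpoint rule, and the sample force F(τ) = 6τ with its mass and initial values are hypothetical choices made only for illustration.

```python
def x_single(force, m, t0, t, x0, v0, steps=2000):
    """Formula (9.5.8): a single integral of F(tau) * (t - tau)."""
    d = (t - t0) / steps
    total = 0.0
    for k in range(steps):
        tau = t0 + (k + 0.5) * d          # midpoint of the k-th interval
        total += force(tau) * (t - tau) * d
    return x0 + (t - t0) * v0 + total / m


def x_double(force, m, t0, t, x0, v0, steps=2000):
    """Formula (9.5.6): integrate F once, then integrate the result again."""
    d = (t - t0) / steps
    inner = 0.0   # running integral of F from t0 to the left edge of the step
    outer = 0.0   # running integral of the inner integral
    for k in range(steps):
        f_mid = force(t0 + (k + 0.5) * d)
        outer += (inner + 0.5 * f_mid * d) * d   # inner integral at the midpoint
        inner += f_mid * d
    return x0 + (t - t0) * v0 + outer / m


# Hypothetical data: F(tau) = 6*tau, m = 2, t0 = 1, t = 3, x(t0) = 0.5, v(t0) = -1.
args = dict(m=2.0, t0=1.0, t=3.0, x0=0.5, v0=-1.0)
single = x_single(lambda tau: 6.0 * tau, **args)
double = x_double(lambda tau: 6.0 * tau, **args)
print(f"single integral: {single:.6f}, iterated integral: {double:.6f}")
```

For this force both integrals can be evaluated by hand, and the two numerical results should agree with each other and with the hand calculation to within the quadrature error.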

The reader will recall that by Newton's third law, in the interaction
of two bodies the force $F_1$ with which the second body acts on the
first is equal in magnitude and opposite in direction to the force
$F_2$ with which the first body acts on the second (see footnote 9.9):

$$F_2(t) = -F_1(t).$$

As applied to the first body and force $F_1$, formula (9.5.2) yields

$$P_1(t) - P_1(t_0) = \int_{t_0}^{t} F_1\, dt. \tag{9.5.9}$$

The same formula applied to the second body and force $F_2$ yields

$$P_2(t) - P_2(t_0) = \int_{t_0}^{t} F_2\, dt. \tag{9.5.10}$$

Since $F_2 = -F_1$ by Newton's third law, it follows that

$$\int_{t_0}^{t} F_2\, dt = -\int_{t_0}^{t} F_1\, dt.$$

That is why (9.5.10) takes the form

$$P_2(t) - P_2(t_0) = -\int_{t_0}^{t} F_1\, dt. \tag{9.5.11}$$

Comparing (9.5.9) and (9.5.11), we find that

$$P_1(t) - P_1(t_0) = P_2(t_0) - P_2(t),$$

whence

$$P_1(t) + P_2(t) = P_1(t_0) + P_2(t_0).$$

9.9 The subscript on F indicates the body acted upon by force F; the
subscript on P also denotes the number of the body to which the
momentum refers.

This formula shows that the action of one body on the other does not
change the sum of the momenta of the bodies.
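
Conservation of the total momentum can be seen in a simple step-by-step simulation. This is a sketch under assumptions: the spring-like interaction law, the masses, the initial data, and the time step are hypothetical; the only ingredient taken from the text is that the force on the second body is the negative of the force on the first.

```python
def simulate(m1, m2, x1, x2, v1, v2, k=5.0, dt=1e-3, steps=5000):
    """Advance two interacting bodies and return their final momenta.

    The interaction force k * (x2 - x1) is a hypothetical spring law;
    any other law could be used as long as f2 = -f1.
    """
    for _ in range(steps):
        f1 = k * (x2 - x1)   # force acting on body 1
        f2 = -f1             # Newton's third law
        v1 += f1 / m1 * dt
        v2 += f2 / m2 * dt
        x1 += v1 * dt
        x2 += v2 * dt
    return m1 * v1, m2 * v2


m1, m2 = 1.0, 3.0
v1_start, v2_start = 2.0, -0.5
p_before = m1 * v1_start + m2 * v2_start
p1, p2 = simulate(m1, m2, x1=0.0, x2=1.0, v1=v1_start, v2=v2_start)
print(f"P1 + P2 before: {p_before:.6f}, after: {p1 + p2:.6f}")
```

The individual momenta change during the motion, while their sum stays the same up to rounding, because at each step the two bodies receive equal and opposite impulses.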

9.6 Kinetic Energy 

Let us consider a body moving under 
the action of a known force F (t) and 
find the relationship between the work 
done by the force and the velocity of 
the body. 

Multiplying both sides of the basic equation m (dv/dt) = F(t)
(Newton's second law) by velocity v, we obtain

$$m v \frac{dv}{dt} = F(t)\, v. \tag{9.6.1}$$

But according to the rule for calculating the derivatives of composite
functions (see Section 4.3), the identity

$$v \frac{dv}{dt} = \frac{d}{dt} \left( \frac{v^2}{2} \right)$$

is valid no matter what the function v(t). Using this fact, we can
rewrite (9.6.1) as

$$m \frac{d}{dt} \left( \frac{v^2}{2} \right) = F(t)\, v,$$

or, since m is constant,

$$\frac{d}{dt} \left( \frac{m v^2}{2} \right) = F(t)\, v.$$

Introducing the notation

$$K = \frac{m v^2}{2}, \tag{9.6.2}$$

we finally obtain

$$\frac{dK}{dt} = F(t)\, v. \tag{9.6.3}$$

Recalling the expression (9.1.1) for work, we can write

$$A = \int_{t_0}^{t_1} F(t)\, v\, dt = \int_{t_0}^{t_1} \frac{dK}{dt}\, dt,$$

whence

$$A = K(t_1) - K(t_0). \tag{9.6.4}$$

The quantity K is the kinetic energy of the body. Formula (9.6.4)
expresses the conservation of energy: the change in the kinetic energy
of a body is equal to the work done by a force. Formula (9.6.3)
expresses the law that the rate of change in kinetic energy is equal
to the power developed by a force.
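
Formula (9.6.4) lends itself to a direct numerical check. The sketch below makes several assumptions: the routine name, the trapezoidal rule, and the sample force F(t) = cos t with its mass and initial velocity are hypothetical. It accumulates the work as the integral of F v dt and compares it with the change in $mv^2/2$.

```python
import math


def work_and_energy(force, m, v0, t0, t1, steps=20_000):
    """Return (work done by the force, change in kinetic energy) over [t0, t1]."""
    dt = (t1 - t0) / steps
    v, t, work = v0, t0, 0.0
    for _ in range(steps):
        f_a, f_b = force(t), force(t + dt)
        v_next = v + 0.5 * (f_a + f_b) * dt / m        # velocity from (9.4.3)
        work += 0.5 * (f_a * v + f_b * v_next) * dt    # A = integral of F v dt
        v, t = v_next, t + dt
    delta_k = 0.5 * m * (v**2 - v0**2)                 # K(t1) - K(t0)
    return work, delta_k


# Hypothetical data: the body initially moves against the force.
work, delta_k = work_and_energy(math.cos, m=2.0, v0=-0.3, t0=0.0, t1=2.0)
print(f"work ~ {work:.6f}, change in kinetic energy ~ {delta_k:.6f}")
```

With the initial velocity directed against the force, the work comes out negative here, so the force reduces the kinetic energy of the body.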

When the force is given by a definite function of time, the impulse
and, hence, the change in momentum caused by the given force are
dependent neither on the mass of the body nor on its initial velocity,
since the impulse and the change in momentum are
$\int_{t_0}^{t_1} F\, dt$. On the contrary, the work done by a force
and the change in the kinetic energy of 
a body under the action of this force 
are essentially dependent, as may be 
seen from (9.6.2)-(9.6.4), not only on 
the force itself but also on the mass of 
the body and the initial velocity. In- 
deed, by acting with a given force over 
a specified time interval on a heavy body 
at rest at the start of motion we im- 
part only a small velocity that will re- 
sult in only a small displacement, and 
the work done by the force will likewise 
be small. A light body will take up 
appreciable work and will acquire a 
large energy. If prior to the action of 
the force the body was in motion in the 
opposite direction to the force, the force 
can reduce the energy of the body. 

9.7 Inertial and Noninertial Reference 
Frames 

Picture a body participating in two motions at once, say, a man
walking in the cabin of a ship in motion, or a ball dropped in the
cabin. Suppose that one of these motions (that of the ship, in our
case) is uniform. The question then arises whether it is possible, by
observing the ball falling in the cabin or the motion of some other
body under the action of an applied force, to establish whether the
ship is moving or not. To put it differently, does the uniform motion
of the ship influence the character of motion of objects on the ship?
No, it does not affect such motion in any way. Experiments have
demonstrated that the absence of any influence of uniform motion on
physical phenomena holds true not only for mechanics but also for the
propagation of light and electric and magnetic phenomena. From this
fact Einstein drew conclusions of tremendous importance in developing
his theory of relativity (we do not explain the theory of relativity
in this book). Newton, in formulating his laws of motion, assumed that
there is absolute time, which flows in the same way in the entire
Universe, and absolute space, in which all processes in the world take
place. In the foregoing we tacitly assumed all these concepts to be
valid. One can imagine, for instance, that somewhere in space there is
an origin of coordinates O, given, say, by the point at which three
metal rods placed at right angles to each other and having markings on
them (or three axes of coordinates, as is commonly said) intersect.
The same point has a watch attached to it. Then everything is simple:
the coordinates of a body are defined as the projections of the body's
position on the metal rods (the axes of coordinates); the body is at
rest if its coordinates remain constant (see footnote 9.10).
In the general case, knowing the dependence of the x, y, and z
coordinates of a body on time t, one can find its velocity and
acceleration (more precisely, the three components dx/dt, dy/dt, and
dz/dt of its velocity and the three components $d^2x/dt^2$,
$d^2y/dt^2$, and $d^2z/dt^2$ of its acceleration). Experience (in this
case astronomical observations) has shown that Newton's laws are valid
when the origin of coordinates lies at the center of gravity of the
solar system, one axis points at the North Star, the second points at
an appropriate star in the equatorial plane, and the third is
perpendicular to the first two and points at another star (in other
words, this star lies on the axis that is perpendicular to the first
two axes). The sun's mass is $2 \times 10^{30}$ kg, Jupiter's mass is
$1.9 \times 10^{27}$ kg, and the distance between Jupiter and the sun
is $777.8 \times 10^{9}$ m; the mass of the rest of the planets is
much less than that of Jupiter. All this positions the center of
gravity of the solar system at a distance of about
$0.78 \times 10^{9}$ m from the sun's center, on the side facing
Jupiter. The sun's radius is $7 \times 10^{8}$ m, so that the center
of gravity of the solar system and, hence, the origin of coordinates
used in astronomical calculations lies outside the sun's surface.

9.10 For the sake of simplicity we always assume that the body is very
small (that is, we are actually talking of a material particle). The
theory that studies the motion of finite bodies, that is, bodies that
can deform and rotate, is too complicated for this book. If we speak
of the motion of a finite body, we always mean the position and motion
of its center of gravity.
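
The order of magnitude of this estimate can be reproduced with a few lines of arithmetic. The sketch below is a two-body approximation (sun and Jupiter only, all other planets neglected) using the rounded values quoted above; the function and constant names are hypothetical.

```python
M_SUN = 2e30          # kg, rounded value quoted in the text
M_JUPITER = 1.9e27    # kg
DISTANCE = 777.8e9    # m, sun-Jupiter distance
R_SUN = 7e8           # m, radius of the sun


def barycenter_offset(m_big, m_small, separation):
    """Distance of the two-body center of gravity from the center of the larger body."""
    return m_small * separation / (m_big + m_small)


offset = barycenter_offset(M_SUN, M_JUPITER, DISTANCE)
print(f"offset from the sun's center: {offset:.3e} m")
print("outside the sun's surface" if offset > R_SUN else "inside the sun")
```

With these rounded inputs the offset comes out between $0.7 \times 10^{9}$ and $0.8 \times 10^{9}$ m, the same order as the figure quoted in the text and slightly larger than the sun's radius.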

A system of coordinates can also be constructed using, say, a
historical landmark. We could place the origin of coordinates at
Trafalgar Square in London, sending one axis "up" Nelson's Column, the
second horizontally to the North Pole, and the third along the
latitude line in the west-east direction that passes through the
chosen origin. Newton's laws operate in such a system of coordinates
with less accuracy than in the above-noted astronomical system; the
reasons for this will be discussed later.

Even without conducting any special experiments, Newton's laws
themselves imply that the system in which these laws operate is not
unique. Indeed, Newton's second law is

$$m \frac{d^2 x}{dt^2} = F_x. \tag{9.7.1}$$

If to x we add a linear function of time, the acceleration does not
change, since if $x_1(t) = x(t) + a + bt$, we have

$$\frac{dx_1}{dt} = \frac{dx}{dt} + b, \qquad \frac{d^2 x_1}{dt^2} = \frac{d^2 x}{dt^2} = \frac{1}{m} F_x.$$

But if in the old system of coordinates the position of the body was
characterized by the function x = x(t), what system should we take so
that the position is characterized by $x_1(t) = x(t) + a + bt$? If at
time t = 0 we shift the origin to point $O_1$ such that
$x(O_1, t = 0) = -a$, then in the new system of coordinates at t = 0
we will have $x_1 = x + a$. If, in addition, the new origin is moving
leftward with a velocity −b, then at time t ≠ 0 its position is
$x(O_1) = -a - bt$. The position of the body reckoned from this origin
is

$$x_1 = x(t) + a + bt, \tag{9.7.2}$$

which is just the right value for Newton's second law to be valid (see
above). The new system is moving, therefore, with a constant velocity
−b with respect to the old system, that is, it moves by inertia in the
same manner as a body on which no forces act. For this reason it is
said that Newton's laws are valid in inertial systems of coordinates,
that is, in systems that move by inertia with respect to one another
without any forces acting on them.

Even before Newton's time the principle by which all inertial
reference frames are equivalent was formulated by Galileo, who wrote
that in the cabin of a ship sailing steadily in quiet water everything
occurs in the same manner as if the ship were at rest. Thus, among the
great variety of systems of coordinates, which can be associated with
a star, planet, or comet, there emerges a narrower class of inertial
systems, or systems in which Newton's laws hold.

In a system of coordinates rotating with respect to an inertial
system, on the contrary, there appear additional terms in the
equations of motion (centrifugal and Coriolis forces), which cannot be
considered in a book of this scope. Rotation of the earth introduces
certain correction terms into the equations written in terms of
coordinates associated with the earth. Fortunately, if we set the axes
by far-off stars, we introduce, with great accuracy, a practically
nonrotating inertial system of coordinates (which is simply one of the
infinitude of inertial reference frames). There was a time when
some scientists concluded from this fact that the very property of
inertia depends on the presence of far-off stars; now, however, this
view is considered artificial. One should not discuss the obvious
property of inertia; rather, one should discuss the properties
(precisely, the positions) of the stars. The stars closest to us are
separated from the sun by distances of about $10^{16}$ to $10^{17}$ m
and move (in relation to the sun) with speeds ranging from 10 km/s to
50 km/s. Although these speeds are not low, the angular displacements
of the stars are infinitesimal; for this reason all systems of
coordinates that are associated with these stars, and in which the sun
is at rest, can be considered inertial.

But can we do without inertial sys- 
tems of coordinates? One interesting 
case is a system in which the orientation 
of the axes does not change, the axes do 
not rotate with the passage of time, but 
the origin moves along one of the axes 
of an inertial system of coordinates 
with constant acceleration. 

Thus, we take an inertial system of coordinates (sometimes called a
reference frame if a clock is attached to the origin) and specify the
law by which the origin of a new system of coordinates, $O_1$, moves
in relation to the old origin (the motion is along the x axis):
$x(O_1) = -f(t)$. Then $x_1(t) = x(t) + f(t)$ and, hence,

$$\frac{d^2 x_1}{dt^2} = \frac{d^2 x}{dt^2} + \frac{d^2 f}{dt^2} = \frac{1}{m} F_x + \frac{d^2 f}{dt^2}.$$

Measuring $x_1(t)$ and applying Newton's second law to this quantity,
the observer will conclude that there is a force $m\,(d^2f/dt^2)$
acting on the body, in addition to the force $F_x$. If the observer
compares the behavior of various bodies in the new system, he or she
will arrive at the conclusion that this additional force is
proportional to a body's mass, since only in this case the additional
acceleration $d^2f/dt^2$ (which occurs in the system) will not depend
on the mass of a body.

The force we have just studied resembles the force of gravity. In
other words, the phenomena which take place in a noninertial system of
coordinates moving with a certain acceleration are similar to those
that occur under the force of gravity. This simple line of reasoning
led Einstein to his basic assumption when he constructed the modern
theory of gravity, or the general theory of relativity.

Let us examine the famous thought 
experiment involving a very special 
noninertial system of coordinates, or 
noninertial reference frame. Consider 
an observer in an elevator. As long as 
the elevator is at rest, the observer feels 
only the force of gravity. For instance, an observer standing still
presses on the floor with a force Mg, where g is the acceleration of
gravity, and M is the observer's mass.

Now suppose that at time t = 0 the elevator is not at the ground floor
and is cut free of the cable. The elevator is therefore in free fall:

$$\frac{d^2 Z}{dt^2} = -g, \qquad Z = z_0 - \frac{g t^2}{2},$$

where Z is the altitude of the elevator above the earth's level at
time t, and $z_0$ is its altitude at t = 0. The observer inside the
elevator reckons his position $z_1$ from the elevator's floor. The
position of the observer with respect to the earth's surface, z, is
related to $z_1$ in the following manner:

$$z_1 = z - Z = z - z_0 + \frac{g t^2}{2}.$$

Let us write the equation for coordinate $z_1$ of a body (material
particle) acted on by the force of gravity and other forces F (elastic
forces of springs and the like). The force of gravity is −gm, where m
is the mass of the particle. In the system of coordinates associated
with the earth the equation of motion is

$$m \frac{d^2 z}{dt^2} = F - gm.$$

Using this equation, we can find the equation governing the motion of
the particle in the system of coordinates $z_1$ associated with the
elevator:

$$m \frac{d^2 z_1}{dt^2} = m \frac{d^2 z}{dt^2} + gm = F.$$

We see that there is no force of gravity in this system of
coordinates. In a freely falling elevator the g-force is zero. This
state is referred to as zero g. For all bodies on which no outer
forces act we have $d^2 z_1/dt^2 = 0$, which yields $z_1$ = constant,
or the state of rest in relation to the elevator. (The reader will
recall that from the viewpoint of an observer on the earth, the
elevator and the particle inside it fall with the same velocity and
the same acceleration g.)

The state of zero g is terminated only 
when the elevator hits the ground (even 
if the impact demolishes the elevator). 
Up to this moment there is no way in 
which experiments conducted inside 
the elevator can determine the pres- 
ence of the earth’s gravity. 

The work on an earlier version of this 
hook was started before man went on a 
space mission. Most of you have seen TV 
programs beamed from space stations and 
satellites. As you could see, the astro- 
nauts experience the state of zero g. 
But a spaceship is usually only some 
200 to 300 km from the earth, where 
the force of gravity diminishes (accord- 
ing to Newton’s law of gravitation) 
only by several percent: 


$$\frac{\Delta g}{g} \approx -2\,\frac{\Delta R}{R} \approx -2\cdot\frac{300}{6400} \approx -0.09 = -9\%.$$

This leads us to the conclusion that 
the total weightlessness (zero g) exist- 
ing in the spaceship is due mainly not 
to the fact that the force of gravity gets 
weaker as one moves away from the 
earth but to the fact that with the rock- 
ets of the spaceship switched off the 
latter is in a state of free fall, just as 
the falling elevator is. 
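
The estimate above is easy to reproduce. The following Python sketch compares the first-order estimate with the value that follows directly from the inverse-square law. The function names and the round value of 6400 km for the earth's radius are choices made here for illustration, not part of the original text.

```python
R_EARTH_KM = 6400.0  # round value of the earth's radius (assumption)


def relative_g_change_exact(h_km, r_km=R_EARTH_KM):
    """Fractional change of g at altitude h_km, from g proportional to 1/R**2."""
    return (r_km / (r_km + h_km)) ** 2 - 1.0


def relative_g_change_linear(h_km, r_km=R_EARTH_KM):
    """First-order estimate: delta g / g is about -2 * delta R / R."""
    return -2.0 * h_km / r_km


if __name__ == "__main__":
    for h in (200.0, 300.0):
        print(h, relative_g_change_exact(h), relative_g_change_linear(h))
```

Both estimates give a reduction of roughly 9 percent at 300 km, which is far too small to account for weightlessness.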

We can now approach the problem of 
inertiality of the reference frame asso- 
ciated with the sun from a different 
angle. We do not insist that both the 
gravitational potential and the force 
of gravity acting on the sun and other 
planets (and, in fact, all bodies in the 
solar system) because of the attraction 


of other stars in our Galaxy and other 
galaxies are negligibly small. Since 
the density of matter is evenly distrib- 
uted (on the average), both the poten- 
tial and the force of gravity are expres- 
sed by integrals that assume infinite 
values (this is known as the gravitatio- 
nal paradox). But the absolute value of 
the gravitational potential never ap- 
pears in the theory, and the derivative of 
the potential, or the force due to the at- 
traction of other bodies, is also com- 
pletely compensated for by the free fall 
of the solar system regardless of whether 
this force is finite or infinite. This jus- 
tifies a study of the solar system irre- 
spective of the rest of the universe and 
removes the gravitational paradox. 


9.8* The Galilean Transformations. 
Energy in a Moving Reference Frame 

Let us go back to inertial reference 
frames and consider in greater detail the 
relationships between the description 
of one and the same phenomenon in 
different inertial reference frames. Suppose that an object is in motion in
the coach of a train that is moving with a constant speed $v_0$ in a
direction fixed by the rails of the track, the direction being that of the
$x$ axis. We will assume that the train is moving in the direction in which
$x$ increases. Then the coordinate $x_1$ of the object with respect to an
observer who is at rest in relation to the track, say, an observer standing
on the platform of the railroad station which the train has left, will be
related to the coordinate $x$ of the object in a reference frame fixed within
the coach (the origin of the coordinate system may be fixed at the back wall
of the coach, for instance) as follows:

$$x_1 = x + v_0 t + a,$$

where $a$ depends on the choice of the origin of coordinates $x_1$; if this
origin coincides with the station which the train has left, then $a$ is the
$x_1$-coordinate of the back wall of the coach at time $t = 0$ (cf. formula
(9.7.2)). If, in addition, we assume that the initial moment of the “train
time” $t$ (the initial moment may be assumed to be the time of departure of
the train from the station) differs from the initial moment of the “absolute
time” $t_1$ (not connected with any train), say, we can assume that $t_1 = 0$
coincides with 00:00 GMT at the station from which the train
departed,$^{9.11}$ then we must write the additional formula $t_1 = t + b$,
where $b$ is the “absolute” time $t_1$ at $t = 0$ (the moment of departure).
The transformations

$$x_1 = x + v_0 t + a, \qquad t_1 = t + b \tag{9.8.1}$$

determine the relationship between the coordinates $(x_1, t_1)$ and $(x, t)$
of one and the same object in two inertial reference frames, one of which
(the train) moves uniformly with respect to the other (the station) with a
velocity (or speed in our one-dimensional case) $v_0$, and are called the
Galilean transformations.

Let us now assume that the object is moving within the coach (in the
direction of increasing $x$) with a certain speed $v$. Then

$$x = x_0 + vt. \tag{9.8.2}$$

We then have

$$x_1 = x + v_0 t + a = (x_0 + vt) + v_0 t + a = (x_0 + a) + (v_0 + v)\,t. \tag{9.8.3}$$

(If we substitute $t_1 - b$ for $t$, we can rewrite the last formula as
$x_1 = [(x_0 + a) - (v + v_0)\,b] + (v + v_0)\,t_1$, but for the sake of
simplicity we will not distinguish between $t_1$ and $t$.) From (9.8.3) it
follows that in relation to the observer on the platform the object is moving
with a speed $v_1 = v + v_0$. This speed will, of course, be different for
the observer traveling in the coach. However, the accelerations with respect
to the two observers will be the same:

$$a_1 = \frac{dv_1}{dt} = \frac{d}{dt}(v + v_0) = \frac{dv}{dt} + \frac{dv_0}{dt} = \frac{dv}{dt} = a,$$

since the constant term $v_0$ in the expression for the speed cannot, of
course, change the acceleration. Therefore, a force acting on the object is
the same for the two observers, or $F = ma_1 = ma$.

9.11 It goes without saying that on a long journey the moment 00:00 GMT does
not change but a new time zone may force us to set our watch back or ahead,
which makes it especially appropriate to distinguish between “train time” $t$
and “station time” $t_1$.

The difference in speeds before and after the force acts is also the same for
the observer on the platform and the observer in the coach. Indeed, let the
speed of the body in relation to the observer in the coach be $v'$ prior to
the action of the force and $v''$ after it, while for the observer on the
platform these speeds are $v_1'$ and $v_1''$, respectively. Then
$v_1' = v' + v_0$ and $v_1'' = v'' + v_0$, whence

$$v_1'' - v_1' = v'' + v_0 - v' - v_0 = v'' - v'.$$


The situation is more complicated with 
kinetic energy. Not only the kinetic energy 
itself but even the differences of kinetic ener- 
gies are distinct for different observers. For 
the observer standing on the platform, 


$$\begin{aligned}
K_1'' - K_1' &= \frac{m (v_1'')^2}{2} - \frac{m (v_1')^2}{2}
  = \frac{m (v'' + v_0)^2}{2} - \frac{m (v' + v_0)^2}{2} \\
 &= \frac{m (v'')^2}{2} - \frac{m (v')^2}{2} + m v_0 v'' - m v_0 v'
  = K'' - K' + m v_0 (v'' - v').
\end{aligned}$$

In this formula $K_1''$ and $K_1'$ are the final and initial kinetic energies
calculated by the observer on the platform, and $K''$ and $K'$ are,
respectively, the kinetic energies calculated by the observer in the coach.
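
The formula for the difference of kinetic energies can be checked numerically. The sketch below is only an illustration under simple assumptions: a constant force, one-dimensional motion, and hypothetical names such as `energy_bookkeeping`. It evaluates the change in kinetic energy and the work (force times distance) for both observers.

```python
def energy_bookkeeping(m, force, duration, v_start, v_frame):
    """Compare kinetic-energy change and work in the coach and platform frames.

    A constant force acts for `duration` on a mass m that has speed v_start
    relative to the coach; the coach moves with speed v_frame relative to
    the platform.
    """
    accel = force / m
    v_end = v_start + accel * duration
    # Distance covered while the force acts, as seen by each observer.
    path_coach = v_start * duration + 0.5 * accel * duration ** 2
    path_platform = path_coach + v_frame * duration

    return {
        "dK_coach": 0.5 * m * (v_end ** 2 - v_start ** 2),
        "work_coach": force * path_coach,
        "dK_platform": 0.5 * m * ((v_end + v_frame) ** 2 - (v_start + v_frame) ** 2),
        "work_platform": force * path_platform,
        "extra_term": m * v_frame * (v_end - v_start),  # m * v_0 * (v'' - v')
    }


if __name__ == "__main__":
    print(energy_bookkeeping(m=2.0, force=4.0, duration=3.0, v_start=1.0, v_frame=10.0))
```

In each frame the change in kinetic energy coincides with the work, while the two frames disagree about both quantities by the same amount, the term listed as `extra_term`.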

The work done by a force and the power are also different for different
observers, since although the force is the same, the distances and the
velocities are different for the observer standing on the platform and for
the observer in the coach. However, the law of equality of change in kinetic
energy and work is valid for any observer, although each of these quantities
taken separately differs for different observers (see the exercises to this
section for examples corroborating this fact).

Note the following remarkable formula valid for a body moving under the
action of only one given force $F(t)$ (here $v_0$ and $v_1$ denote the
velocities at the times $t_0$ and $t_1$):

$$\int_{t_0}^{t_1} F(t)\,v(t)\,dt = \frac{m v_1^2}{2} - \frac{m v_0^2}{2}
  = \frac{m}{2}\,(v_1 - v_0)(v_1 + v_0)
  = \frac{v_1 + v_0}{2} \int_{t_0}^{t_1} F(t)\,dt.$$

We see that in this case the velocity (speed) $v(t)$ may be taken out from
under the integral sign and replaced by the arithmetic mean of the initial
and terminal velocities.

This conclusion holds true only for the case where $v(t)$ is the velocity
acquired by an object under the action of only one force $F(t)$. If the body
is acted upon by a number of forces, say, $F_1$, $F_2$, and $F_3$, then the
work performed by all these forces is equal to the product of the mean
velocity by the sum of the impulses of all forces:

$$\begin{aligned}
A &= \frac{v_1 + v_0}{2} \int_{t_0}^{t_1} (F_1 + F_2 + F_3)\,dt \\
  &= \frac{v_1 + v_0}{2} \int_{t_0}^{t_1} F_1\,dt
   + \frac{v_1 + v_0}{2} \int_{t_0}^{t_1} F_2\,dt
   + \frac{v_1 + v_0}{2} \int_{t_0}^{t_1} F_3\,dt.
\end{aligned} \tag{9.8.4}$$

However, the work done by each of these forces (say, $F_2$) separately is not
equal to the corresponding summand
$\frac{v_1 + v_0}{2} \int_{t_0}^{t_1} F_2\,dt$ in (9.8.4), since the force
$F_2$ acting separately would impart a velocity to the object that differs
from $v(t)$ (see Exercise 9.8.6 below).
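
A rough numerical illustration of this point follows. It is a sketch only: the two forces, the mass, and the helper name `integrate_motion` are assumptions chosen for the example, and the integration is a simple step-by-step scheme rather than anything prescribed by the text.

```python
import math


def integrate_motion(forces, m, v_init, t0, t1, steps=20000):
    """Step the motion under the sum of `forces`; return (v_final, works, impulses).

    Each force is held constant over a small step (its midpoint value), and
    the velocity at the middle of the step is used to accumulate its work.
    """
    dt = (t1 - t0) / steps
    v = v_init
    works = [0.0] * len(forces)
    impulses = [0.0] * len(forces)
    for i in range(steps):
        t_mid = t0 + (i + 0.5) * dt
        values = [f(t_mid) for f in forces]
        dv = sum(values) * dt / m
        v_mid = v + 0.5 * dv
        for k, value in enumerate(values):
            works[k] += value * v_mid * dt
            impulses[k] += value * dt
        v += dv
    return v, works, impulses


if __name__ == "__main__":
    forces = [lambda t: 3.0, lambda t: 2.0 * math.cos(t)]
    v_final, works, impulses = integrate_motion(forces, m=1.5, v_init=0.5, t0=0.0, t1=2.0)
    mean_v = 0.5 * (0.5 + v_final)
    print("total work:", sum(works), "mean velocity * total impulse:", mean_v * sum(impulses))
    print("work of F_2:", works[1], "mean velocity * impulse of F_2:", mean_v * impulses[1])
```

The total work agrees with the product of the mean velocity and the total impulse, whereas for a single force taken out of the sum the two numbers differ.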

Above we saw that two inertial reference frames, $(x, t)$ and $(x', t')$,
specified on the $x$ axis are related through the following formulas:

$$x' = x + vt + a, \qquad t' = t + b, \tag{9.8.5}$$

where $v$ is the velocity of the reference frame $(x, t)$ in relation to the
reference frame $(x', t')$, so that the velocity of $(x', t')$ in relation to
$(x, t)$ is $-v$. The Galilean transformations (9.8.5) or (9.8.1) play in
mechanics a role similar to that played in geometry by motions (cf. Section
1.9). The transformation from one inertial reference frame to another
inertial reference frame has no effect on physical phenomena; hence, the only
physically meaningful facts and quantities are those which do not change
under transformations (9.8.5). For instance, the linear equation

$$x = vt + a \tag{9.8.6}$$

expresses the fact of uniform motion of a point along the $x$ axis (the very
straight line along which we consider the motion), but the specific value of
$v$ depends on the choice of the concrete reference frame and has no physical
meaning. (It is for this reason that Newton’s laws contain only the
acceleration and not the velocity of a material particle.) But if we are
dealing with two uniform motions, expressed via (9.8.6) and
$x = v_1 t + a_1$, respectively, the difference $v_1 - v$ has a concrete
physical meaning: it is the relative velocity of one motion with respect to
the other motion and does not depend on the choice of (inertial) reference
frame. There is another physical quantity with a definite meaning: the
interval $\tau = t_2 - t_1$ between two events, $(x_1, t_1)$ and
$(x_2, t_2)$; if the two events occur simultaneously, $t_1 = t_2$, we can
speak of the spatial distance (or space interval) $d = x_2 - x_1$ between
these events (the “distance” $\tau$ between the events is measured in units
of time, say, seconds, while the second “distance,” $d$, which has a meaning
only at $\tau = 0$, is measured in entirely different units, say, meters).

An arbitrary curve $x = x(t)$ in the $xt$-plane (Figure 9.8.1) fixes the law
of motion of a (material) point along a straight line. The slope $k = x'$
($= dx/dt$) of the tangent to the curve at point (or event) $(x, t)$, which
it would be more appropriate to denote by $v$, gives the velocity of motion
at this point. If we substitute for curve $\Gamma$ (equation $x = x(t)$) the
tangent $l$ at the given point (this tangent constitutes a good approximation
of $\Gamma$ in the neighborhood of this point), we transform the arbitrary
motion into the “approximating” uniform motion with a velocity equal to the
instantaneous velocity at the given moment in time. The second derivative
$x''$ ($= d^2x/dt^2$) plays the role of the curvature of the curve and also
expresses the acceleration of motion. If

$$\frac{d^2 x}{dt^2} = a = \text{constant},$$

we are dealing with the law of motion

$$x = \frac{a t^2}{2} + bt + c, \tag{9.8.7}$$

with $b$ and $c$ arbitrary constants, which law corresponds to uniformly
accelerated (if $a$ is positive) or uniformly decelerated (if $a$ is
negative) motion and geometrically corresponds to a parabola (9.8.7) in the
$xt$-plane. Replacing an arbitrary curve $x = x(t)$ by the approximating
parabola $\Pi$ (9.8.7) with the same values of derivatives $x'$ ($= v$) and
$x''$ ($= a$) means replacing an arbitrary motion by a uniformly accelerated
(or decelerated) motion with the same (instantaneous) velocity and
acceleration. Geometrically this replacement means that for curve
$x = x(t)$ in the $xt$-plane we have substituted the osculating parabola
(cf. Section 7.10). Physically, the transition from an arbitrary curve to its
tangent (at a certain point, of course) means eliminating all forces, since a
straight line in the $xt$-plane represents uniform motion, which occurs in
the absence of all forces (motion by inertia), while the transition from a
curve to a parabola (9.8.7) is equivalent to the assumption that all forces
are constant.$^{9.12}$


Exercises 

9.8.1. Find the formula for the kinetic 
energy of a body moving under a constant 
force F (with zero velocity at the initial time) 
as a function of time and also as a function 
of the distance traveled. 

9.8.2. A body is in motion under a force $F = f \cos \omega t$, and $v = 0$
at $t = 0$. Find the expression for the kinetic energy of the body, and
determine the maximum of kinetic energy.

9.8.3. A body is moving in accordance with the law
$x = x(t) = A \cos(\omega t + \alpha)$, with $A$, $\omega$, and $\alpha$
constants. Determine the average kinetic energy provided that $t$ increases
without bound from $t = 0$.

9.8.4. A ball of mass $m$ falls from a height $H$ from a state of rest.
Demonstrate that the kinetic energy of the ball, $K$, is equal to
$mg(H - h)$, where $h$ is the height of the ball above the ground at a given
instant of time.

9.8.5. A train with a mass of 500 metric 
tons started out from a station, and in 3 min- 
utes developed a speed of 45 km/h traveling 
1.5 km. Determine (a) the work and the ave- 
rage power of the locomotive on the assump- 
tion that friction on the rails is absent, and 
(b) the same but having regard for friction 
(the coefficient of friction k is equal to 0.004; 
the force of friction is equal to the force of 
attraction of the train to the earth, or the 
train’s weight, multiplied by k). 


9.12 See I. M. Yaglom, A Simple Non-Euclidean Geometry and Its Physical
Basis, Springer, Berlin, 1969.


9.8.6. A body is under two forces: $F_1 = at$ and $F_2 = a(\theta - t)$. The
impulses of these forces over the interval from 0 to $\theta$ are the same.
At time $t = 0$ the body has a velocity $v_0$. Find the work done by each
force during the time interval from 0 to $\theta$ and compare it with the
product of the impulse by the average velocity.

9.8.7. A man standing still on the ground acts on a given mass $m$ with a
force $F$ during the time interval $t$. As a result the mass, which was
originally at rest, acquires a velocity $v_1 = Ft/m$ and a kinetic energy
$mv_1^2/2$ equal to the work performed by the man.

Consider the same experiment done in a train traveling at a velocity $v_0$.
The mass $m$ had a velocity $v_0$ prior to the experiment and $v_0 + v_1$
after the experiment. Find the change in the kinetic energy of the mass $m$.
What work was done by the man? Assuming that the man rests firmly against the
wall of the railway car and $v_0$ does not change, find the work of the force
done by the train (locomotive) during the experiment.

9.8.8. A man of mass $M$ standing on skates on ice (friction between skates
and ice is neglected) acts with a force $F$ on a mass $m$ during time $t$.
What kinetic energy will be imparted to the mass $m$? What kinetic energy
will the man acquire? What is the total work done by the force acting on mass
$m$ and on the man? Why is it greater than in Exercise 9.8.7?

9.8.9. The same experiment as in Exercise 9.8.8, but the man has an initial
velocity $v_0$ and moves together with mass $m$. The velocity of mass $m$ is,
after the action of the force, $v_0 + Ft/m$, and the velocity of the man is,
respectively, $v_0 - Ft/M$. Find the change in kinetic energy of mass $m$ and
the man as a result of the action of the force. Find the work done by the
force, which work is equal to the change in the total kinetic energy, and
compare it with the result of the preceding exercise.

9.8.10. Prove that under the Galilean transformations (9.8.5) the following
quantities retain their values: (a) the temporal interval $\tau$ between two
events, (b) the spatial distance $d$ between two simultaneous events, and
(c) the relative velocity $v_1 - v$ of one uniform motion relative to
another.

9.8.11. Verify that a transformation to a 
new (inertial) reference frame via the Gali- 
lean transformations (9.8.5) does not change 
the acceleration a of (uniformly accelerated 
or decelerated) motion in (9.8.7). 


9.9* The Path of a Projectile. 
The Safety Parabola 

Let us consider the problem of the flight path of a projectile (shell) fired
from a gun with initial velocity $v_0$. We take the point of ejection of the
shell from the barrel of the gun for the origin and send the $y$ axis
vertically upward, while the $x$ axis is assumed horizontal (the motion of
the projectile occurs, therefore, in the $xy$-plane). For the sake of
simplicity we disregard air resistance, because taking it into account would
introduce considerable complications.

By Newton’s second law,

$$m\frac{d\mathbf{v}}{dt} = \mathbf{F}. \tag{9.9.1}$$

We applied this law earlier only for rectilinear motion. However, in the
flight-path problem the direction of $\mathbf{v}$ varies with time (the
velocity is always directed along a tangent to the path of the shell). For
this reason we are forced to assume that the quantities $\mathbf{v}$ and
$\mathbf{F}$ in Eq. (9.9.1) are vectors, or $\mathbf{v} = (v_x, v_y)$ and
$\mathbf{F} = (F_x, F_y)$.

Any motion in the $xy$-plane may be regarded as the result of combining two
motions: one occurring along the $x$ axis under force $F_x$ with velocity
$v_x$, and the other along the $y$ axis under force $F_y$ with velocity
$v_y$. Applying Newton’s second law to each of these motions separately
yields

$$m\frac{dv_x}{dt} = F_x, \qquad m\frac{dv_y}{dt} = F_y \tag{9.9.1a}$$

(see Figure 9.9.1). In each of these equations the force and the velocity are
directed along a single straight line (the $x$ axis in the first equation and
the $y$ axis in the second).

Denote by $\varphi$ the angle which the barrel of the gun makes with the
horizontal; call $\varphi$ the angle of departure. Since we are considering
the most elementary case in which the shell in flight is acted upon solely by
the force of gravity directed earthwards, it follows that $F_x = 0$ and
$F_y = -mg$. And so the equations in (9.9.1a) have the form

$$m\frac{dv_x}{dt} = 0, \qquad m\frac{dv_y}{dt} = -mg. \tag{9.9.2}$$

Figure 9.9.1

What is remarkable (and important) here is that our equations have separated:
the first contains only one unknown function $v_x$, and the second contains
the unknown function $v_y$.

Let us find the initial conditions for the functions $v_x(t)$ and $v_y(t)$.
At the time of emergence of the shell from the barrel, $t = 0$,

$$v_x(0) = v_0 \cos\varphi, \qquad v_y(0) = v_0 \sin\varphi,$$

where $v_0$ ($= |\mathbf{v}_0|$) is the initial (or muzzle) velocity of the
shell. The first equation in (9.9.2) yields $dv_x/dt = 0$, from which it
follows that $v_x$ is constant, and hence

$$v_x(t) = v_x(0) = v_0 \cos\varphi. \tag{9.9.3}$$

The second equation in (9.9.2) yields $dv_y/dt = -g$, whence, integrating
both parts from 0 to $t$, we get $v_y(t) - v_y(0) = -gt$, or

$$v_y(t) = -gt + v_0 \sin\varphi. \tag{9.9.4}$$


To determine the displacements $x$ and $y$ along the coordinate axes, we take
advantage of the fact that $\mathbf{v} = d\mathbf{r}/dt$, where
$\mathbf{r} = (x, y)$ is the radius vector of the shell. This vector equation
splits into two scalar equations:

$$\frac{dx}{dt} = v_x, \qquad \frac{dy}{dt} = v_y, \tag{9.9.5}$$

or, in view of (9.9.3) and (9.9.4),

$$\frac{dx}{dt} = v_0 \cos\varphi, \qquad \frac{dy}{dt} = -gt + v_0 \sin\varphi. \tag{9.9.6}$$

The equations are again separated: the first deals only with the unknown
function $x(t)$, and the second deals with the unknown function $y(t)$.

At the initial instant of time, the shell was at the origin, with the result
that

$$x = 0 \ \text{and} \ y = 0 \ \text{at} \ t = 0. \tag{9.9.7}$$

Integrating Eqs. (9.9.6) from 0 to $t$ and using the initial conditions
(9.9.7), we find that

$$x = v_0 t \cos\varphi, \qquad y = v_0 t \sin\varphi - \frac{g t^2}{2}. \tag{9.9.8}$$

These formulas enable us to determine the position of the shell at any
instant of time $t$.

Taking various values of $t$, we can find the position of the shell from
formulas (9.9.8) at different times and plot a graph of the path of the
shell. Thus, Eqs. (9.9.8) yield a curve in the $xy$-plane. They determine a
parametric representation of the path of the shell, with $t$ the parameter
(see Section 1.8).

From Eqs. (9.9.8) we can easily eliminate $t$ and obtain an equation of the
path in the ordinary form, with $y$ as a function of $x$. Indeed, the first
equation in (9.9.8) yields $t = x/(v_0 \cos\varphi)$; then from the second we
get

$$y = x \tan\varphi - x^2\,\frac{g}{2 v_0^2 \cos^2\varphi}. \tag{9.9.9}$$

From this we see that $y$ is a second-degree polynomial in $x$, the graph of
which is a parabola. Consequently, disregarding air resistance, we can say
that the path of a shell is in the shape of a parabola. Figure 9.9.2 depicts
the path specified by (9.9.9) for the case where $v_0 = 80$ m/s and
$\varphi = 45^\circ$.

From (9.9.9) it is evident that for one and the same $v_0$, the shape of the
trajectory depends on the angle of departure $\varphi$. We will find the
maximum altitude of ascent of the shell and the range of fire for given
$\varphi$ and $v_0$. To determine the coordinate $x_1$ of the maximum
altitude of the shell, $y_{\max}$, we set up the equation $dy/dx = 0$ to get

$$\tan\varphi - \frac{g x_1}{v_0^2 \cos^2\varphi} = 0, \qquad
  x_1 = \frac{v_0^2 \sin\varphi \cos\varphi}{g} = \frac{v_0^2 \sin 2\varphi}{2g}.$$

For this value of $x$, the height $y$ is at its maximum (it is physically
clear that this is precisely the maximum; this can be verified easily by
finding the sign of $d^2y/dx^2$). Substituting this value of $x$ into (9.9.9)
yields

$$y_{\max} = \frac{v_0^2 \sin^2\varphi}{2g}.$$


To determine the range of the shell, it suffices to find the value $x_2$ for
which $y = 0$ (see Figure 9.9.2):

$$x_2 \tan\varphi - x_2^2\,\frac{g}{2 v_0^2 \cos^2\varphi} = 0.$$

Disregarding the solution $x_2 = x_0 = 0$, which does not interest us, we
find that

$$x_2 = \frac{v_0^2 \sin 2\varphi}{g}. \tag{9.9.10}$$


The range of fire depends on the ini- 
tial velocity and the angle of departure. 

For what angle of departure (initial velocity $v_0$ unchanged) is the range
of fire the greatest? Formula (9.9.10) shows that this happens when
$\sin 2\varphi = 1$, or $\varphi = 45^\circ$, and the range in this case is
$v_0^2/g$. Also, since $\sin 2\varphi = \sin 2(90^\circ - \varphi)$ and in
view of (9.9.10), for angles $\varphi_0$ and $90^\circ - \varphi_0$ (say, at
angles of departure $\varphi = 30^\circ$ and $\varphi = 60^\circ$) the range
is the same and shells hit the same target; artillerymen call fire with
shells having angles of departure less than $45^\circ$ grazing, and fire with
angles of departure greater than $45^\circ$ plunging.

Let us determine the time $t_1$ during which the shell rises upward. All we
need to do is solve the equation $dy(t)/dt = 0$ because at time $t = t_1$,
when $y$ attains its maximum value, the shell will cease to rise and will
begin to fall. The condition $dy/dt = 0$ yields $v_0 \sin\varphi - gt = 0$,
whence

$$t_1 = \frac{v_0 \sin\varphi}{g}. \tag{9.9.11}$$

The total flight time $t_2$ can be found from the fact that the flight ceases
when $x = x_2$. Combining (9.9.8) and (9.9.10), we find that

$$v_0 t_2 \cos\varphi = \frac{v_0^2}{g} \sin 2\varphi,$$

whence

$$t_2 = \frac{2 v_0 \sin\varphi}{g}. \tag{9.9.12}$$

Comparing (9.9.12) with (9.9.11), we see that the total flight time $t_2$ is
twice the time of ascent $t_1$ for any angle of departure, or that the ascent
time of a shell is equal to the descent time.
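
The formulas for the range, the maximum height, and the flight times lend themselves to a quick check. The Python sketch below is an illustration only; the function name `flight_summary` and the round value $g = 9.8$ m/s² are assumptions made here, and air drag is ignored as in the text.

```python
import math

G = 9.8  # acceleration of gravity, m/s^2 (assumed round value)


def flight_summary(v0, angle_deg, g=G):
    """Range, maximum height, ascent time and total flight time without air drag."""
    phi = math.radians(angle_deg)
    return {
        "range": v0 ** 2 * math.sin(2 * phi) / g,             # formula (9.9.10)
        "max_height": v0 ** 2 * math.sin(phi) ** 2 / (2 * g),
        "ascent_time": v0 * math.sin(phi) / g,                # formula (9.9.11)
        "flight_time": 2 * v0 * math.sin(phi) / g,            # formula (9.9.12)
    }


if __name__ == "__main__":
    for angle in (30.0, 45.0, 60.0):
        print(angle, flight_summary(80.0, angle))
```

Running it for complementary angles such as 30° and 60° shows equal ranges but different heights and flight times, in agreement with the discussion of grazing and plunging fire.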

Now let us turn from the problem of firing at targets on the ground to
antiaircraft fire. If for a given value of the shell’s initial velocity (this
velocity depends on the type of gun) we change the angle of departure, we
obtain a whole family of trajectories (Figure 9.9.3), where the plunging-fire
trajectories, that is, trajectories with $45^\circ < \varphi < 90^\circ$,
touch a single curve

$$y = \frac{v_0^2}{2g} - \frac{g x^2}{2 v_0^2} \tag{9.9.13}$$

(the grazing-fire trajectories lie inside the “dome” (9.9.13) and do not
touch it). Indeed, solving Eqs. (9.9.9) and (9.9.13) simultaneously, we find
that the curves (parabolas) have only one common point:

$$x = \frac{v_0^2}{g} \cot\varphi, \qquad y = \frac{v_0^2}{2g}\,(1 - \cot^2\varphi) \tag{9.9.14}$$

(this point lies in the upper half-plane only for $\varphi \ge 45^\circ$, in
view of which only plunging-fire trajectories in (9.9.9) touch the curve
specified by (9.9.13)). The fact that the appropriate parabolas only touch
(that is, do not intersect) follows from the sole fact that the common point
of (9.9.9) and (9.9.13) is unique; this can be verified by directly
calculating the slope of the tangent to both curves at point (9.9.14).
Parabola (9.9.13) can be called a safety parabola: if a target A (say, an
aircraft) is outside the “dome”, there is no angle of departure at which it
can be hit. The target can be hit only by using another type of gun.
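
The role of the safety parabola can be illustrated by asking directly which angles of departure send the shell through a given point. The sketch below rests on an observation not spelled out in the text: with $T = \tan\varphi$, Eq. (9.9.9) is a quadratic equation in $T$. The helper names and the value $g = 9.8$ m/s² are assumptions introduced for this example.

```python
import math

G = 9.8  # acceleration of gravity, m/s^2 (assumed round value)


def safety_height(x, v0, g=G):
    """Height of the safety parabola (9.9.13) above the point x."""
    return v0 ** 2 / (2 * g) - g * x ** 2 / (2 * v0 ** 2)


def trajectory_height(x, v0, phi, g=G):
    """Height of the trajectory (9.9.9) with departure angle phi (radians)."""
    return x * math.tan(phi) - g * x ** 2 / (2 * v0 ** 2 * math.cos(phi) ** 2)


def departure_angles(x, y, v0, g=G):
    """Departure angles (radians) whose trajectory passes through (x, y), x > 0.

    With T = tan(phi), (9.9.9) becomes k*T**2 - x*T + (y + k) = 0,
    where k = g*x**2 / (2*v0**2).
    """
    k = g * x ** 2 / (2 * v0 ** 2)
    disc = x ** 2 - 4 * k * (y + k)
    if disc < 0:
        return []  # the target lies outside the "dome"
    root = math.sqrt(disc)
    return sorted({math.atan((x - root) / (2 * k)), math.atan((x + root) / (2 * k))})


if __name__ == "__main__":
    print(safety_height(400.0, 80.0))
    print([math.degrees(p) for p in departure_angles(400.0, 150.0, 80.0)])
    print(departure_angles(400.0, 250.0, 80.0))
```

The discriminant of the quadratic is nonnegative exactly when the target lies on or below the curve (9.9.13), so a target above the dome yields an empty list of angles, while a target inside it can be reached by two trajectories.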

Note, in conclusion, that the actual flight paths of shells are not exact
parabolas. In reality the shell experiences air drag, which we ignored; the
acceleration of gravity $g$ depends on the distance from the surface of the
earth; rotation of the earth also introduces certain distortions into the
fall of the shell; and, finally, Newton’s laws of motion of a material point
cannot be applied without certain reservations to the motion of a projectile
(although the latter factor is not so important). This, however, is not a
unique situation: we encounter it whenever we apply mathematics to real-life
problems; we always ignore certain factors and are content with a rough
picture, which enables us to use mathematical tools we are accustomed to. The
actual range of fire, the altitude of flight of a shell, the time of flight,
and so on depend on the mass of the shell, the shell’s shape, and the air
density. For instance, when the initial velocity of a shell is low (see
Figure 9.9.2, which refers to the case with $v_0 = 80$ m/s), the role of air
drag is indeed insignificant, while for a great $v_0$ there is no way in
which the drag can be ignored; for a 305-mm caliber gun with a muzzle
velocity of $v_0 = 800$ m/s and $\varphi = 55^\circ$, the air drag that acts
on the flying shell diminishes the flight range from 61 km (which value is
given by formula (9.9.10)) to 22.2 km.

Exercises 

9.9.1. A shell leaves the gun with a velocity of 80 m/s. Determine the range
of fire and the maximum height reached by the shell if the angle of departure
$\varphi = 30^\circ$, $45^\circ$, and $60^\circ$.

9.9.2. Determine the maximum altitude at which a shell with initial velocity
$v_0 = 80$ m/s can hit a target located 500 meters from the gun.

9.10 The Motion of a Body 
in Outer Space 

Let us now turn to the problems of the motion of objects (artificial
satellites and spaceships) in outer space. For a body launched from the earth
to become a satellite of the earth, that is, for it to orbit around the
earth, the centrifugal force on the body must be balanced by the
gravitational force of the earth. The respective velocity $u_1$ is commonly
known as the satellite (or orbital) velocity.$^{9.13}$ To find $u_1$, we set
up the equation

$$m\frac{u_1^2}{R} = mg, \tag{9.10.1}$$

where $g$ is the acceleration of gravity, $m$ the mass of the body in
question, and $R$ the orbit’s radius (the left-hand side of Eq. (9.10.1) is
the centrifugal force, and the right-hand side is the force of gravity of the
earth). If the orbit is low, that is, the body flies close to the earth’s
surface, $R$ is close to $r_0$, the earth’s radius, whereby on the right-hand
side of (9.10.1) we can put the force of gravity at the earth’s surface.
Formula (9.10.1) then yields

$$u_1 = \sqrt{gR} \approx \sqrt{g r_0} \approx 8\ \text{km/s}. \tag{9.10.2}$$

9.13 We are interested here in steady-state motion with a velocity $u_1$ (and
later with velocities $u_2$ and $u_3$) and ignore the transient launching
stage.

For a satellite that is orbiting the earth at a distance $R$ from the earth’s
center that exceeds $r_0$ considerably, we must take into account the
variation of $g$ with altitude; the value of $g$ at the earth’s surface is
denoted by $g_0$ (it is this value that was substituted into formulas
(9.10.1) and (9.10.2)). Indeed, by Newton’s law of gravitation, a body placed
at a distance $R$ from the center of the earth is attracted to the earth with
a force $F = GmM/R^2$, where $m$ is the body’s mass, and $M$ the earth’s
mass. In addition, by Newton’s second law, $F = mg$, where $g = g(R)$ is the
acceleration of gravity at a distance $R$ from the earth’s center. Comparing
the two expressions for $F$, we can write $g = GM/R^2$. If $R = r_0$, then
$g = g_0$, whence $g_0 = GM/r_0^2$, which yields $G = g_0 r_0^2/M$ and,
therefore,

$$g(R) = g_0 r_0^2/R^2.$$

The condition (9.10.1) for the centrifugal force being equal to the force of
gravity then assumes the form

$$m u_1^2/R = m g_0 r_0^2/R^2,$$

which means that the satellite velocity $u_1$ is given by the following
formula:

$$u_1 = (g_0 r_0^2/R)^{1/2}.$$

The greater the distance $R$, the lower 
the velocity $u_1$ necessary for a satellite 
to remain in orbit. However, this does 
not at all mean that it is easier to 
launch a satellite to a higher orbit than 
to a lower orbit, since to launch a sat- 
ellite to a high orbit we must spend a 
lot of energy in overcoming the force 
of gravity on the trip from the earth’s 
surface to the orbit. Of practical impor- 
tance is the motion of satellites on a 
stationary, or 24-hour, orbit, which 
lies approximately at an altitude of 
36 000 km above the earth’s equator; a 
satellite launched into such an orbit will 
“hang” over a spot on the equator (see 
Exercise 9.10.1). 
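
The formula $u_1 = (g_0 r_0^2/R)^{1/2}$ can be turned into a small calculation. The sketch below is hedged accordingly: the function names are hypothetical, $g_0 = 9.8$ m/s² and $r_0 = 6400$ km are round values assumed here, and the period relation $T = 2\pi R/u_1$ for a circular orbit is an added step that the text leaves to the exercise.

```python
import math

G0 = 9.8    # acceleration of gravity at the earth's surface, m/s^2 (assumed)
R0 = 6.4e6  # earth's radius, m (round value)


def satellite_velocity(R, g0=G0, r0=R0):
    """Circular-orbit velocity u_1 = sqrt(g0 * r0**2 / R) at distance R from the center."""
    return math.sqrt(g0 * r0 ** 2 / R)


def orbit_radius_for_period(T, g0=G0, r0=R0):
    """Radius of the circular orbit with period T, from T = 2*pi*R / u_1(R)."""
    return (g0 * r0 ** 2 * T ** 2 / (4 * math.pi ** 2)) ** (1.0 / 3.0)


if __name__ == "__main__":
    print("low-orbit velocity, m/s:", satellite_velocity(R0))
    R24 = orbit_radius_for_period(24 * 3600.0)
    print("24-hour orbit altitude, km:", (R24 - R0) / 1000.0)
```

With these round inputs the low-orbit velocity comes out near 8 km/s and the altitude of the 24-hour orbit near 36 000 km, in line with the figures quoted in the text.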

Let us now consider a more compli- 
cated problem. For a body to leave the 
bounds of earth’s gravity, its initial 
kinetic energy must be greater than the 
difference between the potential energy 
of the body at a far-off point and that 
at the earth’s surface. This difference 
was found in Section 9.2 (formula (9.2.8)). Here it is assumed that the body
acquires its speed rapidly over a path that is short compared to the earth’s
radius, so that the variation in potential energy over this path can be
ignored. In other words, we must assume that the thrust of the reactive force
operating while the rockets launching the satellite are burning (see Section
9.11) is very high and the force of gravity can be ignored.$^{9.14}$

The minimal velocity that a moving body (such as a rocket) must have to
escape from the gravitational field of the earth or of a celestial body and
move outward into space is called the escape velocity. Let us find this
velocity for the earth (we denote it by $u_2$). By (9.2.8), the initial
energy required for a body to reach a point in space lying at a distance $R$
from the earth’s center, if prior to motion it was at a distance $r_0$ (on
the earth’s surface), is given by the formula
$K_0 = mg\,(r_0/R)(R - r_0)$. In our case $R$ is much larger than $r_0$,
whence $R - r_0 \approx R$, which yields $K_0 \approx mgr_0$. We assume this
to be equal to the kinetic energy of the rocket:

$$m u_2^2/2 = m g r_0.$$

This yields

$$u_2 = \sqrt{2 g r_0} \approx 11.2\ \text{km/s}. \tag{9.10.3}$$

Combining (9.10.2) with (9.10.3), we find that

$$u_2 = \sqrt{2}\,u_1 \approx 1.4\,u_1 \quad \text{and} \quad u_2^2 = 2u_1^2.$$
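
As a numerical illustration of (9.10.2) and (9.10.3), the following sketch evaluates both velocities. The function names and the round values $g = 9.8$ m/s² and $r_0 = 6400$ km are assumptions made for the example.

```python
import math


def low_orbit_velocity(g, r0):
    """Satellite velocity near the surface, u_1 = sqrt(g * r0), formula (9.10.2)."""
    return math.sqrt(g * r0)


def escape_velocity(g, r0):
    """Escape velocity from the surface, u_2 = sqrt(2 * g * r0), formula (9.10.3)."""
    return math.sqrt(2.0 * g * r0)


if __name__ == "__main__":
    g, r0 = 9.8, 6.4e6  # assumed round values, SI units
    u1 = low_orbit_velocity(g, r0)
    u2 = escape_velocity(g, r0)
    print("u_1, km/s:", u1 / 1000.0)
    print("u_2, km/s:", u2 / 1000.0)
    print("u_2 / u_1:", u2 / u1)
```

The ratio of the two results is $\sqrt{2}$ regardless of the values chosen for $g$ and $r_0$, which is the content of the relation $u_2^2 = 2u_1^2$.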

For the sun, this velocity (denoted $u_3$), when imparted to a body, takes
the body outside the solar system. We will find it using the fact that the
speed with which the earth orbits the sun is known: $v_1 \approx 30$ km/s.

9.14 It has been found that it is more advantageous to burn the rocket fuel
fast (less fuel is required) than to extend the burning process over the
entire time necessary to travel a distance of the order of the earth’s
radius. Only in the layer where the density of the atmosphere is high, and so
is the air drag, are high velocities not advantageous, but since the
thickness of this layer is much smaller than the earth’s radius (see Chapter
11), we will not take this layer into account.


By Newton’s law of gravitation, the force with which a body of mass $m$ is
attracted to the sun is $F = -kM_1m/r^2$, where $M_1$ is the sun’s mass
($M_1 \approx 2 \times 10^{30}$ kg), $r$ the distance from the body to the
sun’s center, and $k$ a constant. The potential energy of the body separated
by a distance $r$ from the sun’s center is

$$u(r) = -\frac{kM_1 m}{r}. \tag{9.10.4}$$

Here the zero of potential energy is taken as its value at infinity (cf.
Section 9.2).

The value of the potential energy of
a body which orbits the sun together
with the earth can easily be expressed
in terms of the speed with which the
earth orbits the sun. Indeed, on the
earth’s orbit the force with which the
earth is attracted to the sun is balanced
by the centrifugal force, that is,

$$\frac{kM_1M_2}{r_1^2} = \frac{M_2v_1^2}{r_1}, \qquad (9.10.5)$$

where $v_1$ we already know, $r_1$ is the
radius of the earth’s orbit
($r_1 \approx 150 \times 10^6$ km $= 1.5 \times 10^{11}$ m), and $M_2$
is the earth’s mass (this quantity
cancels out of Eq. (9.10.5)).
Whence $kM_1 = v_1^2r_1$, and (9.10.4) assumes
the form

$$u(r_1) = -v_1^2m.$$
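The relation $kM_1 = v_1^2r_1$ can be checked against the quoted mass of the sun. The sketch below assumes that the constant $k$ is the usual gravitational constant and takes a standard value for it; that value is not given in the text and is an assumption of this example.

```python
k = 6.674e-11   # N*m^2/kg^2, gravitational constant (assumed value, not from the text)
v1 = 30e3       # m/s, orbital speed of the earth
r1 = 1.5e11     # m, radius of the earth's orbit

kM1 = v1**2 * r1        # from the balance of attraction and centrifugal force
M1 = kM1 / k            # implied mass of the sun
u_per_kg = -v1**2       # potential energy per unit mass at r1, u(r1)/m

print(f"M1 = {M1:.2e} kg, u(r1)/m = {u_per_kg:.1e} J/kg")
```

The implied mass agrees with $M_1 \approx 2 \times 10^{30}$ kg to within the rounding of the inputs.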

For a body at a distance $r_1$ from the sun
to leave the bounds of the sun’s gravitation,
the sum of the body’s kinetic
and potential energies at this distance
must be nonnegative. This leads to the
following inequality:

$$m\frac{v_2^2}{2} + u(r_1) = m\frac{v_2^2}{2} - mv_1^2 \ge 0, \qquad (9.10.6)$$

where $v_2$ is the velocity that the body
must have to leave the solar system,
and $v_1$ is the known orbital velocity of
the earth.

We have already encountered a similar
situation when we considered the
motion of a body in the earth’s gravitational
field: the velocity necessary
for a body to leave this field corresponds
to a kinetic energy that is twice
the kinetic energy corresponding to the
velocity that ensures that the body
is an (artificial) satellite of the earth.

Inequality (9.10.6) yields the minimal
velocity $v_2$ that a body must have
to leave the solar system:

$$v_2 = \sqrt{2}\,v_1 \approx 1.4\,v_1 \approx 42\ \text{km/s}.$$

Thus, to leave the solar system, a 
body on the earth’s orbit must have an 
initial velocity (with respect to the 
sun) that is at least 42 km/s. If the ve-
locity is greater than 42 km/s, the body
will leave the solar system irrespective
of the direction of this velocity, that
is, irrespective of whether the body 
moves along the radius away from the 
sun (path 1 in Figure 9.10.1) or along 
a tangent to the earth’s orbit (paths 
2 and 3) or even in the direction of the 
sun (but at a certain angle so as not to 
land on the sun’s surface, path 4). Only 
the shape of the trajectory depends 
on the direction of the initial velocity 
(see Figure 9.10.1). 

It is clear that the most advanta- 
geous trajectory for launching a rocket 
from the earth’s surface is 2: the earth 
moves with a speed of 30 km/s, so that 
only 12 km/s is required to obtain 
42 km/s if we launch the rocket in the 
direction of the earth’s orbital motion. 
The rocket must have this speed $v_2'$
after it has left the earth’s gravitational
field, that is, after it has traveled a
distance large compared to the radius
of the earth.

What initial velocity $u_3$ must the
rocket have at the earth’s surface for
this to happen (it is this velocity that
is called the escape velocity for the
sun)? It can be found from the obvious
relationship

$$m\frac{u_3^2}{2} = mgr_0 + m\frac{(v_2')^2}{2}. \qquad (9.10.7)$$

Figure 9.10.1

Here the first term on the right-hand
side is the energy necessary for overcoming
the attraction of the earth, and
the second term is the energy that the
rocket (or spaceship) of mass $m$ must
have after it has overcome the earth’s
gravity for its speed $v_2'$ to be sufficient
(together with the 30 km/s of the
earth’s motion in orbit) for the rocket
to leave the solar system. Formula
(9.10.7) then yields

$$u_3^2 = 2gr_0 + (v_2')^2 = u_2^2 + (v_2')^2, \qquad (9.10.8)$$

whence

$$u_3 = \left(u_2^2 + (v_2')^2\right)^{1/2} \approx (11.2^2 + 12^2)^{1/2} \approx 16.4\ \text{km/s}. \qquad (9.10.9)$$
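The arithmetic behind (9.10.6)-(9.10.9) can be retraced in a few lines. The sketch below uses only the numbers quoted in the text; rounding $v_2$ to 42 km/s before subtracting the earth's orbital speed is done here to match the 12 km/s used in the text.

```python
import math

v1 = 30.0                      # km/s, orbital speed of the earth
u2 = 11.2                      # km/s, escape velocity for the earth

v2 = math.sqrt(2) * v1         # minimal speed relative to the sun, from (9.10.6)
v2_prime = round(v2) - v1      # speed still needed after leaving the earth's field
u3 = math.hypot(u2, v2_prime)  # (u2**2 + v2_prime**2) ** 0.5, formula (9.10.9)

print(f"v2 = {v2:.1f} km/s, v2' = {v2_prime:.0f} km/s, u3 = {u3:.1f} km/s")
```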

Note that for the rocket to get closer
to the sun or reach, say, Mars or
Venus, $u_2$ is insufficient. Indeed, having
this velocity, the rocket will leave
the earth and move along the earth’s
orbit with the velocity of the earth, or
30 km/s.

Although the potential energy does
diminish as the rocket moves toward
the sun, the centrifugal force that
acts on the rocket moving along the
orbit prevents the rocket from approaching
the sun. To get closer to the sun, the rocket
must diminish its speed, which is as
difficult as increasing the speed. For
instance, to hit the sun, the rocket must
be stopped, that is, it must move with
a velocity of 30 km/s with respect to
the earth (after it leaves the earth’s
gravitational field). For this the rocket
must have an initial velocity

$$u_4 \approx (30^2 + 11.2^2)^{1/2} \approx 32\ \text{km/s} \qquad (9.10.10)$$

at the earth’s surface.
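Formulas (9.10.9) and (9.10.10) share one pattern: the launch speed is the root of the sum of squares of the earth's escape velocity and the speed required after the earth's field has been left. The helper `launch_speed` below is a hypothetical name introduced only for this illustration.

```python
import math

U2 = 11.2  # km/s, escape velocity for the earth (value from the text)

def launch_speed(v_after_escape, u2=U2):
    """Speed at the earth's surface that leaves v_after_escape far from the earth."""
    return math.hypot(u2, v_after_escape)

u3 = launch_speed(12.0)   # leaving the solar system along the earth's motion
u4 = launch_speed(30.0)   # cancelling the earth's orbital speed to fall into the sun

print(f"u3 = {u3:.1f} km/s, u4 = {u4:.1f} km/s")
print(f"kinetic energy ratio (u4/u3)^2 = {(u4 / u3) ** 2:.1f}")
```

In terms of launch energy, hitting the sun costs several times more than escaping from the solar system, which is the comparison the text draws next.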

We see that it is more difficult to 
hit the sun than to escape from its 
gravitational bounds. More “profitable”
variants of space travel can be obtained
by using the gravitational pull
of other planets. However, we will not 
discuss such questions in this book. 

Exercise 

9.10.1. Find the radius of an orbit in which 
a satellite circles the earth every 24 hours. 
(If a satellite is launched into such an orbit 
in the equatorial plane, it will “hang” con- 
stantly over a single point on earth.) 

9.11 Jet Propulsion and Tsiolkovsky’s 
Formula 

In airless space, the only widely used
and highly effective method of flight
control, that is, of changing speed and
direction, consists in ejecting a portion
of the mass of the flying object itself,
which means applying the reaction (jet)
principle.

The Russian scientist K. E. Tsiol- 
kovsky (1857-1935) was the first to 
fully realize the significance of the jet 
principle and to investigate the funda-
mental laws of reaction, or jet,
propulsion. He was also the first to
suggest the possibility of conquering 
outer space by means of rockets. From 
him, via his pupils and followers — 
Soviet scientists and engineers — stems 
the scientific tradition that saw final 
embodiment in artificial earth satel- 
lites, space probes, and space sta- 
tions with astronauts on board. 

Let us derive the basic equation of
the rectilinear motion of a rocket. The
propellant, whether gunpowder or a
mixture of fuel (alcohol or gasoline)
and oxidizer (oxygen or nitric acid),
possesses a definite supply $Q$ of chemical
energy per unit mass (of the order
of $5 \times 10^6$ J/kg for smokeless powder
and $10 \times 10^6$ J/kg for a gasoline (or
petrol) and oxygen mixture; see footnote 9.15). In
burning, this chemical energy is converted
into the thermal energy of the products
of combustion, which stream out
of the nozzle, the thermal energy turning
into the kinetic energy of motion.

9.15 The heating value of gasoline is about
$50 \times 10^6$ J/kg, but burning 1 kg of gasoline
requires 3.4 kg of oxygen. In a rocket launched
into airless space, the oxygen has to be carried
along and the energy must be referred to the
sum of the masses of fuel and oxidizer.

When a reaction (rocket) engine is
fixed on a test bed, the combustion
products are exhausted at a definite
velocity $u_0$. The kinetic energy they
have per unit mass constitutes a definite
portion of the chemical energy of the
propellant:

$$\frac{u_0^2}{2} = \alpha Q, \qquad (9.11.1)$$

where $\alpha$ is a dimensionless number, the
efficiency of the processes of combustion
and ejection of gases (see footnote 9.16). From now
on we will consider the exhaust velocity
$u_0$ to be a given known quantity.
It is roughly 2 km/s for powder and
about 3 km/s for liquid propellant. It
is easy to see that these quantities are
associated with values of $\alpha \approx 0.5$
(an efficiency of the order of
50%).
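The claim that these exhaust velocities correspond to $\alpha$ of the order of 0.5 follows directly from (9.11.1). A minimal sketch, using only the values of $Q$ and $u_0$ quoted above (the function name is introduced here for illustration):

```python
def internal_efficiency(u0, Q):
    """alpha from u0**2 / 2 = alpha * Q, with u0 in m/s and Q in J/kg."""
    return u0**2 / (2 * Q)

alpha_powder = internal_efficiency(2e3, 5e6)    # smokeless powder
alpha_liquid = internal_efficiency(3e3, 10e6)   # gasoline and oxygen mixture

print(f"powder: alpha = {alpha_powder:.2f}, liquid: alpha = {alpha_liquid:.2f}")
```

Both values fall somewhat below one half, consistent with the order-of-magnitude statement in the text.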

Prior to combustion, the propellant
was at rest. Suppose a mass $dm$ of
propellant is burnt and exits from the
nozzle. In so doing, it acquires a momentum
equal to $u_0\,dm$. Clearly, the
impulse $dI$ of the force with which the
rocket acts on this mass is equal to the
momentum acquired by the mass (see footnote 9.17),
or

$$dI = F\,dt = u_0\,dm.$$

By Newton’s third law (every action
has an equal and opposite reaction),
the impulse of the force with which
the mass $dm$ of the combustion products
acts on the rocket vehicle is equal to
the same quantity with sign reversed.
Suppose, for instance, that the exhaust
velocity $u_0$ is in the direction of decreasing
$x$. Then $u_0$ is negative, or
$u_0 = -|u_0|$. For the impulse of the
force acting on the rocket we have

$$dI_r = F_r\,dt = -u_0\,dm = |u_0|\,dm. \qquad (9.11.2)$$

9.16 If in (9.11.1) $Q$ is expressed in J/kg,
then $u_0$ will be expressed in m/s.

9.17 The designation $dI$ is due to the
fact that we consider a small mass $dm$.

The quantity

$$\frac{dI_r}{dm} = |u_0| \qquad (9.11.3)$$

is the impulse per unit mass, what is
called the unit impulse. This quantity
is equal to the exhaust speed of gases
from a rocket at rest.

Let us check the dimensions in formula
(9.11.3). The force $F$ has the dimensions
of kg·m/s², or N, and the
impulse $I$ is the product of force by
time, so its dimensions are N·s (or
kg·m/s). The dimensions of $dI/dm$ are
(kg·m/s)/kg = m/s, which are the dimensions
of velocity. For powder
gases, $u_0 = 2 \times 10^3$ m/s $= 2$ km/s,
while for liquid fuel $u_0 = 3$ km/s.

The force acting on the rocket is, by
formula (9.11.2),

$$F_r = |u_0|\frac{dm}{dt}.$$

It is proportional to the quantity of
gases exhausted in unit time.

Now let us examine the derivation of
the formula for the velocity of the rocket
vehicle. If the rocket is itself in
motion with a velocity $u$, then the exhaust
velocity of the gases differs from
$u_0$ and is equal to $u + u_0 = u - |u_0|$
(recall that when the vehicle is at rest,
the exhaust velocity of gases is equal
to $-|u_0|$). It is obvious that such
quantities as the difference between the
velocity of powder prior to combustion
and the velocity of the exhausted
powder gases, and the force with
which the powder gases act on the rocket,
are independent of whether the
rocket vehicle is in motion or at rest
(for the sake of definiteness we speak of
powder as the propellant).

Let us denote the initial mass of the
rocket together with the powder by
$M_0$ and the mass of exhausted powder
gases by $m$. The quantity $m$ is a function
of time, or $m = m(t)$. The mass of
the rocket with powder at time $t$ is
equal to

$$M = M(t) = M_0 - m(t). \qquad (9.11.4)$$


The equation of motion, or Newton’s
second law, is

$$M\frac{du}{dt} = F_r = |u_0|\frac{dm}{dt},$$

which can be rewritten as $M\,du = |u_0|\,dm$,
or, using (9.11.4),

$$(M_0 - m)\frac{du}{dm} = |u_0|. \qquad (9.11.5)$$

The possibility of cancelling out $dt$
has the physical meaning that, in the
absence of other forces acting on the
rocket, the velocity of the rocket depends
only on the amount of exhausted
powder gases (for a fixed value of $u_0$).
By the time a given amount $m$ of powder
gases is exhausted through the nozzle,
the rocket has acquired a definite
velocity $u$, irrespective of the time during
which the given amount of powder
gases was released.

It is easy to solve the differential
equation (9.11.5). At $t = 0$, $u = 0$
and $m = 0$. We thus have

$$u = |u_0|\int_0^{m_1}\frac{dm}{M_0 - m} = -|u_0|\ln(M_0 - m)\Big|_0^{m_1} = |u_0|\,[-\ln(M_0 - m_1) + \ln M_0],$$

where $m_1$ is the total mass of powder gases
exhausted by a given time, and
$M = M_0 - m_1$ is the mass of the rocket
at this moment in time. We have
therefore arrived at Tsiolkovsky's formula

$$u = |u_0|\ln\frac{M_0}{M}. \qquad (9.11.6)$$

If we are interested in the terminal
velocity $u_{\text{ter}}$ at burnout, in formula
(9.11.6) we must substitute $M_{\text{ter}}$ (terminal
mass of the rocket after all the
fuel has burnt out) for $M$, that is,
$M_{\text{ter}} = M_0 - m_{\text{tot}}$, with $m_{\text{tot}}$ the
total mass of the propellant. We get

$$u_{\text{ter}} = |u_0|\ln\frac{M_0}{M_{\text{ter}}}. \qquad (9.11.7)$$

This formula can easily be used to
solve the inverse problem: what initial
mass $M_0$ of the rocket vehicle must
be taken to impart a definite velocity
$u_{\text{ter}}$ to a given terminal mass $M_{\text{ter}}$?
We have

$$\ln\frac{M_0}{M_{\text{ter}}} = \frac{u_{\text{ter}}}{|u_0|},$$

whence

$$M_0 = M_{\text{ter}}\exp\frac{u_{\text{ter}}}{|u_0|}. \qquad (9.11.8)$$
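The integration leading to Tsiolkovsky's formula can be checked numerically by summing the increments $du = |u_0|\,dm/(M_0 - m)$ directly. The sketch below is only an illustration: the function names and the sample masses are hypothetical, and the midpoint rule is a choice made here, not something taken from the text.

```python
import math

def tsiolkovsky(u0, M0, M):
    """Velocity from formula (9.11.6): u = |u0| * ln(M0 / M)."""
    return abs(u0) * math.log(M0 / M)

def integrate_burn(u0, M0, m_total, steps=100_000):
    """Sum du = |u0| * dm / (M0 - m) over the burn, using the midpoint rule."""
    dm = m_total / steps
    u = 0.0
    for i in range(steps):
        m_mid = (i + 0.5) * dm
        u += abs(u0) * dm / (M0 - m_mid)
    return u

def initial_mass(M_ter, u_ter, u0):
    """Inverse problem, formula (9.11.8): M0 = M_ter * exp(u_ter / |u0|)."""
    return M_ter * math.exp(u_ter / abs(u0))

u0, M0, m_tot = 2.0, 100.0, 90.0            # km/s and arbitrary mass units (sample values)
closed = tsiolkovsky(u0, M0, M0 - m_tot)    # closed form
numeric = integrate_burn(u0, M0, m_tot)     # step-by-step summation

print(f"closed form: {closed:.6f} km/s, numerical: {numeric:.6f} km/s")
print(f"initial mass recovered from (9.11.8): {initial_mass(M0 - m_tot, closed, u0):.3f}")
```

The two velocities agree to many digits, and (9.11.8) returns the initial mass that was assumed.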

Everywhere in our analysis we ignored
the (transient) process of “slow”
combustion, when both the force of gravity
and the reaction force act on the
body (rocket). Of course, in real life
this situation is always present; we
only hoped that by separating these
two forces we do not introduce a large
error into the results. The forces of gravity
were studied in Section 9.10,
while jet propulsion has been studied
in the present section (see footnote 9.18). We will now
try to unite the results of these two sections
by using formulas (9.11.6)-(9.11.8)
to estimate the values of
$M_0/M_{\text{ter}}$ necessary for attaining the
satellite velocity $u_1$ and the escape velocities
$u_2$ and $u_3$. We will again assume
that the propellant is gunpowder,
for which $|u_0| = 2$ km/s. Using formula
(9.11.8), we get $(M_0/M_{\text{ter}})_1 = e^4 \approx 54$
for $u_1 = 8$ km/s, $(M_0/M_{\text{ter}})_2 = e^{5.6} \approx 270$
for $u_2 = 11.2$ km/s, and
$(M_0/M_{\text{ter}})_3 = e^{8.2} \approx 3640$ for $u_3 = 16.4$ km/s.
In the case of liquid propellant,
$|u_0| = 3$ km/s, and similar
calculations yield $(M_0/M_{\text{ter}})_1 \approx 14.5$,
$(M_0/M_{\text{ter}})_2 \approx 42$, and $(M_0/M_{\text{ter}})_3 \approx 245$.
From the foregoing we see that the magnitude
of $M_0/M_{\text{ter}}$ is strongly dependent
on the exhaust velocity of the
gases, $u_0$. To get an idea of the difficulty
of the problem of launching a rocket,
one should bear in mind that $M_{\text{ter}}$
includes the mass of the fuel tanks,
etc.

9.18 Note that such a “coarse” approach to
a real process was undertaken in Section 9.10,
when we separated into two stages the launching
of an object into outer space: in the
first stage we assumed the acceleration of gravity
$g$ to be constant and equal to $g_0$, the
acceleration of gravity at the earth’s surface,
while in the second stage the object moved
in a mode in which $g$ diminished as the distance
to the earth’s center increased.
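The mass ratios quoted above follow from (9.11.8) and are easy to tabulate. The sketch below uses the velocities from the text; the function name is introduced here for illustration. Because the text rounds the exponents before exponentiating, the unrounded figures can differ slightly from the quoted ones, most visibly for $u_3$ with liquid propellant.

```python
import math

def mass_ratio(u_ter, u0):
    """M0 / M_ter from formula (9.11.8)."""
    return math.exp(u_ter / abs(u0))

targets = {"u1": 8.0, "u2": 11.2, "u3": 16.4}    # km/s, values from the text
exhaust = {"powder": 2.0, "liquid": 3.0}         # km/s, values from the text

for name, u0 in exhaust.items():
    for label, u in targets.items():
        print(f"{name:6s} {label}: M0/M_ter = {mass_ratio(u, u0):8.1f}")
```

Raising the exhaust velocity from 2 to 3 km/s lowers the required mass ratio by more than an order of magnitude for the highest target velocity, which is the strong dependence noted in the text.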

Let us find the efficiency of a rocket
as a whole. We define this quantity as
the ratio of the kinetic energy of the
rocket at burnout, $M_{\text{ter}}u_{\text{ter}}^2/2$, to the
chemical energy of the burnt fuel,
$mQ = (M_0 - M_{\text{ter}})Q$. The efficiency
is

$$\eta = \frac{M_{\text{ter}}u_{\text{ter}}^2}{2Q(M_0 - M_{\text{ter}})}. \qquad (9.11.9)$$


Substituting the expression for $u_{\text{ter}}$
from (9.11.7) and expressing $u_0^2$ from
(9.11.1), we finally obtain

$$\eta = \alpha\,\frac{M_{\text{ter}}}{M_0 - M_{\text{ter}}}\left(\ln\frac{M_0}{M_{\text{ter}}}\right)^2.$$

The efficiency proves to be the product
of the “internal efficiency” $\alpha$ (which
characterizes the completeness of burning
of the fuel and the conversion of
thermal energy into the kinetic energy
of the gases) and a second factor that
depends solely on the choice of the ratio
between the mass of propellant, $m$,
and the mass $M_{\text{ter}}$ of the payload. We
denote $m/M_{\text{ter}}$ by $z$. Then
$M_0 = M_{\text{ter}} + m = M_{\text{ter}}(1 + z)$, and

$$\eta = \alpha\,\frac{M_{\text{ter}}}{m}\left(\ln\frac{M_{\text{ter}} + m}{M_{\text{ter}}}\right)^2 = \frac{\alpha}{z}\,[\ln(1 + z)]^2. \qquad (9.11.10)$$


At first glance it might appear that
due to the fraction $1/z$ the efficiency is
very great for small $z$. In reality, for
small $z$ we have $\ln(1 + z) \approx z$ (see
Section 4.9) and so $\eta \approx (\alpha/z)z^2 = \alpha z$
for $z \ll 1$. The efficiency is proportional
to $z$ and, hence, is small for
small $z$, with the result that the rocket
moves slowly and almost all the energy
is carried away by the gases. For very
great $z$, the efficiency again falls because
of diminished payload mass (see footnote 9.19).
Since the terminal velocity of the rocket
is also dependent solely on $z$, we can
say that the efficiency of the rocket is
determined by the requisite velocity.
At small velocities the efficiency of the
rocket vehicle is low and so it is
disadvantageous to employ jet propulsion
in automobiles and other cases of
relatively slow motion. At high velocities,
the energy efficiency of the rocket
again diminishes, but the use of
the rocket is nevertheless justified since
we do not possess any other means of
accelerating bodies to high velocities.

9.19 For large $z$, the quantity $[\ln(1 + z)]^2$
grows more slowly than $z$. Indeed, denoting
$y = \ln(1 + z)$, we get $z = e^y - 1$, and the
exponential function grows faster than any
power of $y$ (see Section 6.5).
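The behavior of the factor $[\ln(1 + z)]^2/z$ in (9.11.10) at small and large $z$ can be seen from a short table. The sketch below is an illustration only; the sample values of $z$ are arbitrary, and the velocity column uses $u_{\text{ter}}/|u_0| = \ln(1 + z)$, which follows from (9.11.7) with $M_0 = M_{\text{ter}}(1 + z)$.

```python
import math

def efficiency_factor(z):
    """Second factor of (9.11.10): [ln(1 + z)]**2 / z, so that eta = alpha * factor."""
    return math.log1p(z) ** 2 / z

for z in (0.01, 0.1, 1.0, 4.0, 100.0, 1.0e4):
    speed = math.log1p(z)   # u_ter / |u0| for this propellant-to-payload ratio
    print(f"z = {z:8.2f}   u_ter/|u0| = {speed:5.2f}   eta/alpha = {efficiency_factor(z):.4f}")
```

For small $z$ the factor is close to $z$ itself, and for large $z$ it falls off again, so the efficiency is highest at intermediate values of $z$.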

In conclusion let us find the value of
$z = m/M_{\text{ter}}$ which yields maximum
efficiency $\eta$ and the magnitude of this
maximum. In view of (9.11.10), the
problem reduces to determining the
maximum of the function
$F(z) = z^{-1}[\ln(1 + z)]^2$, that is, to solving the
equation

$$F'(z) = \frac{2\ln(1 + z)}{(1 + z)z} - \frac{[\ln(1 + z)]^2}{z^2} = 0,$$

or

$$\ln(1 + z) = \frac{2z}{1 + z}. \qquad (9.11.11)$$

The simplest way of solving this equation
is by graph. To do this, we determine
the point of intersection of the
curves $f_1(z) = \ln(1 + z)$ and
$f_2(z) = 2z/(1 + z)$ (curves $\Gamma_1$ and $\Gamma_2$ in
Figure 9.11.1). This yields $z \approx 4$, whence
$F(z) \approx 0.65$ and $\eta_{\max} \approx 0.65\alpha$ (it
is easy to verify that $z \approx 4$ corresponds
to a maximum of $F(z)$ and not to a
minimum).
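Equation (9.11.11) can also be solved numerically instead of graphically. The sketch below uses bisection, which is a technique chosen here for illustration and not the method of the text; the bracket from 1 to 10 is an assumption that excludes the trivial root $z = 0$.

```python
import math

def h(z):
    """Difference of the two curves: ln(1 + z) - 2z / (1 + z)."""
    return math.log1p(z) - 2 * z / (1 + z)

def F(z):
    """The function to be maximized, F(z) = [ln(1 + z)]**2 / z."""
    return math.log1p(z) ** 2 / z

lo, hi = 1.0, 10.0            # h(lo) < 0 < h(hi), so a root lies in between
for _ in range(60):           # each step halves the bracket
    mid = (lo + hi) / 2
    if h(lo) * h(mid) <= 0:
        hi = mid
    else:
        lo = mid

z_star = (lo + hi) / 2
F_max = F(z_star)
print(f"z = {z_star:.3f}, F(z) = {F_max:.3f}")
```

The root lies slightly below 4 and the corresponding value of $F$ is close to 0.65, in agreement with the graphical estimate.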

Exercise 

9.11.1. Prove that the value $z \approx 4$ corresponds
to the maximum of the function
$F(z) = z^{-1}[\ln(1 + z)]^2$.


9.12 The Mass, Center of Gravity, 
and Moment of Inertia of a Rod 

We consider a thin rod. The $x$ axis will
lie along the rod. Denote by $\sigma$ the mass
per unit length of rod. Thus, on a portion
$dx$ between $x$ and $x + dx$ there
will be a mass

$$dm = \sigma\,dx. \qquad (9.12.1)$$

The quantity $\sigma$ (g/cm) is the product
of the volume density $d$ (g/cm³) of the
rod’s material and the cross-sectional
area $S$ (cm²) of the rod, or $\sigma = Sd$.
The rod may have a cross-sectional area
and a density that depend on $x$, so
that $\sigma$ is a function of $x$, or $\sigma = \sigma(x)$.
The quantity $\sigma$ should be called the
linear density or the density per unit
length, but since the real density $d$
(volume density) does not enter into
the subsequent computations, we will
call $\sigma$, for short, the density. We consider
the thickness of the rod to be small
and depict it merely as a straight line,
a line segment of the $x$ axis. The total
mass of the rod is clearly

$$m = \int_a^b \sigma(x)\,dx, \qquad (9.12.2)$$

where $a$ and $b$ are the coordinates of the
ends of the rod.

Let the rod be fixed on the x axis, 
which is assumed to be a rigid rod so 
thin and of so small a mass that it can 
be considered a massless line; the x 
axis is horizontal, while the y axis is 
directed vertically upward. The force 
of gravity acts on the rod (as indicated 
in Figure 9.12.1) tending to pull the 
rod downward. 

Imagine that the x axis is a weight 
lever. Indicated schematically in the 
drawing is a prism supporting the x 
axis at the origin. The x axis can thus
rotate about the axis perpendicular to
the plane of the drawing. Let us find 
the load M on the left at a distance R 
along the x axis that is needed to 
balance the rod on the right. 

By the laws of a lever, the element of
mass $dm$ distant $x$ to the right of the
support $O$ is balanced by the element of
mass $dM$ to the left if the masses are
inversely proportional to the distances,
that is, if

$$\frac{dM}{dm} = \frac{x}{R}, \quad\text{or}\quad R\,dM = x\,dm. \qquad (9.12.3)$$

The element of mass $dm$ is equal (as
we learned above) to $\sigma\,dx$. To balance
the entire rod we need a mass $M$ that
satisfies the equation

$$RM = \int_a^b x\sigma(x)\,dx. \qquad (9.12.4)$$

This equation is the result of integrating
the left and right members of
(9.12.3). To the right of the axis, different
elements of mass $dm$ are located
at different distances $x$ from the support.
This is why the quantity $x\sigma(x)$
appears under the integral sign. (In
this way the integral determining the
mass $M$ that balances the rod differs
from the integral determining the rod’s
mass $m$.) To the left of support $O$, all
elements of mass $dM$ (which balance the
distinct elements $dm$ of the rod) are
collected together at the same distance
$R$ from the support. $R$ is a constant and
so $\int R\,dM = R\int dM = RM$.

The question now is: if we concentrate
the entire mass of the rod, $m$, at one
point $C$, then at what distance $x_c$ must
this point be from the support (from
the origin, that is) in order to balance
the mass $M$ at distance $R$ that is balanced
by the rod (Figure 9.12.2)?
We find that

$$RM = x_cm = \int_a^b x\sigma\,dx, \qquad (9.12.5)$$

whence

$$x_c = \frac{1}{m}\int_a^b x\sigma\,dx = \int_a^b x\sigma\,dx \Big/ \int_a^b \sigma\,dx. \qquad (9.12.6)$$


The quantity $x_c$ is the coordinate of
the center of gravity or, as it is also
called, the center of mass of the rod, $C$.
It is highly important that point $C$
is indeed a definite point of the rod: if
we displace the rod as a whole along the
$x$ axis, say, a distance $l$ to the right
(Figure 9.12.3), $x_c$ will also increase by
the same quantity $l$, so that the point
$C$ with coordinate $x = x_c$ is always
(for a given rod) at a very definite distance
from the endpoints of the rod.
We will prove this.

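Let us consider a rod displaced a distance
$l$ to the right from the original
position (Figure 9.12.3b); the center of
gravity of the shifted rod will be denoted
by $C_1$. Then

$$x_{C_1} = \frac{\int_{a+l}^{b+l} x\,\sigma(x - l)\,dx}{\int_{a+l}^{b+l} \sigma(x - l)\,dx} = \frac{\int_a^b (x + l)\,\sigma(x)\,dx}{\int_a^b \sigma(x)\,dx} = \frac{\int_a^b x\sigma\,dx}{\int_a^b \sigma\,dx} + l\,\frac{\int_a^b \sigma\,dx}{\int_a^b \sigma\,dx} = x_c + l, \qquad (9.12.7)$$

which constitutes a result that is obvious
from the start.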
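Formulas (9.12.6) and (9.12.7) can be tried out numerically. The sketch below is an illustration under stated assumptions: the density $\sigma(x) = 1 + x$ on the segment from 0 to 2, the shift $l = 3$, and the helper names are all hypothetical choices made here, and the integrals are approximated by the midpoint rule.

```python
def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def center_of_gravity(sigma, a, b):
    """x_c from (9.12.6): integral of x*sigma divided by integral of sigma."""
    m = integrate(sigma, a, b)
    return integrate(lambda x: x * sigma(x), a, b) / m

def sigma(x):
    return 1.0 + x            # hypothetical density, heavier toward the right end

a, b, l = 0.0, 2.0, 3.0
x_c = center_of_gravity(sigma, a, b)
# The same rod moved a distance l to the right has density sigma(x - l) on [a + l, b + l].
x_c_shifted = center_of_gravity(lambda x: sigma(x - l), a + l, b + l)

print(f"x_c = {x_c:.6f}, shifted x_c = {x_c_shifted:.6f}, difference = {x_c_shifted - x_c:.6f}")
```

For this density the exact value is $x_c = 7/6$, and the center of gravity of the shifted rod moves by exactly the shift, as (9.12.7) states.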

The most convenient thing is to choose
the system of coordinates with origin
at the center of gravity of the rod
(Figure 9.12.4). The quantities in this
coordinate system will be denoted by
a zero subscript. It is clear that

$$\int_{a_0}^{b_0} \sigma_0(x)\,dx = m.$$

The coordinate of the center of gravity
$x_{C_0}$ is zero in this system and so

$$\int_{a_0}^{b_0} x\sigma_0(x)\,dx = 0. \qquad (9.12.8)$$


In other words, if we wish to balance
a rod supported at the center of gravity
by a load $M$ situated at a fixed distance
$R$ from the support, we will find
that the mass required is zero (cf.
(9.12.4) and (9.12.8)): if the point of
support coincides with the center of
gravity of a rod, the rod is balanced
without applying any additional load.

Figure 9.12.4

We will now show that for any position
of the rod its potential energy in
the field of gravity is equal to the potential
energy of its entire mass concentrated
at the center of gravity of the
rod. We consider the position of the
rod as indicated in Figure 9.12.5. The
potential energy $du$ of an element of rod
with mass $dm$ is equal to $gz\,dm$, where
$z$ is the altitude and $g$ is the acceleration
of gravity. The potential energy $u$
of the whole rod is found by integrating.
For the variable of integration we choose
a length reckoned along the rod from
its center of gravity; the density at
point $x$ is denoted by $\sigma_0(x)$. We express
the altitude $z$ in terms of $x$. As is evident
from Figure 9.12.5, $z(x) = z_c + x\cos\alpha$,
where $z_c$ is the height of the
center of gravity of the rod. We get

$$u = \int_{a_0}^{b_0} gz\,\sigma_0(x)\,dx = g\int_{a_0}^{b_0} (z_c + x\cos\alpha)\,\sigma_0(x)\,dx = gz_c\int_{a_0}^{b_0} \sigma_0(x)\,dx + g\cos\alpha\int_{a_0}^{b_0} x\sigma_0(x)\,dx = gz_cm,$$

since the second integral is equal to
zero by formula (9.12.8). Thus, the potential
energy depends only on the
mass of the rod and the height of the
rod’s center of gravity but does not depend
on the angle between the rod and
the horizontal: rotation of the rod about
its center of gravity requires no energy;
a rod fixed by a hinge at the center of
gravity is capable of freely rotating about
the support without release or expenditure
of energy.
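The independence of the potential energy from the angle $\alpha$ can be checked numerically. In the sketch below the rod, its density, the height $z_c$, and the value of $g$ are all assumptions made for illustration: the rod has density $1 + s$ along a length $0 \le s \le 2$, so that its center of gravity lies at $s = 7/6$, and $x$ is measured from that point.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

S_C = 7.0 / 6.0                    # center of gravity of the hypothetical rod
a0, b0 = -S_C, 2.0 - S_C           # rod ends measured from the center of gravity

def sigma0(x):
    return 1.0 + (x + S_C)         # density expressed in the shifted coordinate

g, z_c = 9.81, 3.0                 # assumed gravity and height of the center of gravity
m = integrate(sigma0, a0, b0)

def potential_energy(alpha):
    """u = g * integral of (z_c + x*cos(alpha)) * sigma0(x) over the rod."""
    return g * integrate(lambda x: (z_c + x * math.cos(alpha)) * sigma0(x), a0, b0)

energies = [potential_energy(alpha) for alpha in (0.0, 0.5, 1.2, math.pi / 2)]
for e in energies:
    print(f"u = {e:.6f}   (g * z_c * m = {g * z_c * m:.6f})")
```

Every angle gives the same energy $gz_cm$ up to the accuracy of the numerical integration.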

Now let us investigate the concept of
the moment of inertia of a rod. This concept
comes up in a consideration of the
rotational motion of a rod. Let a rod
be in rotation about an axis perpendicular
to the plane of the drawing and
passing through the origin. Then each
point of the rod will describe a circle
centered at the origin and having a radius
equal to the abscissa $x$ of the given
point in the initial (horizontal) position
of the rod (Figure 9.12.6). Denote
by $\omega$ the angular velocity of rotation
expressed in radians per second. This
means that during time $dt$ the rod rotates
through the angle $d\varphi = \omega\,dt$. The
arc length traversed by an arbitrarily
chosen point with abscissa $x$ is
$dl = x\,d\varphi = x\omega\,dt$; hence the linear velocity
of motion of a point on the rod
with abscissa $x$ is $v(x) = dl/dt = \omega x$.

Figure 9.12.6

Let us find the kinetic energy of rotation
of the whole rod. An element of
mass $dm$ distant $x$ from the center of
rotation (i.e. from the origin; here we
are speaking of a segment of the rod of
length $dx$ with endpoints at $x$ and
$x + dx$) has kinetic energy

$$\frac{v^2}{2}\,dm = \frac{\omega^2x^2}{2}\,\sigma(x)\,dx.$$

Hence, the kinetic energy of the whole
rod is

$$E = \frac{\omega^2}{2}\int_a^b x^2\sigma(x)\,dx.$$

The integral in this formula is called
the moment of inertia of the rod about
the axis passing through the origin and
is symbolized by $I$, or

$$I = \int_a^b x^2\sigma(x)\,dx. \qquad (9.12.9)$$

Thus, $E = I\omega^2/2$, or the kinetic energy
of rotation is expressed in terms of the
moment of inertia and the angular velocity
in exactly the same way that
the kinetic energy of simple translation
is expressed in terms of mass and linear
velocity, $E = mv^2/2$. Note also that
the moment of inertia (9.12.9) is always
positive (as the mass $m$ is).


If we are dealing with a system of point
masses $m_1, m_2, \ldots, m_k$ lying at points
$M_1, M_2, \ldots, M_k$ in a plane, the moment of inertia
of this system about a straight line $l$ is the sum

$$\sum_i m_id_i^2 = m_1d_1^2 + m_2d_2^2 + \ldots + m_kd_k^2 \qquad (9.12.10)$$

of the products of mass $m_i$ by the square of
the distance from point $M_i$ to $l$, or $d_i^2$ (here
$i = 1, 2, \ldots, k$). If we have a body (a plate
or a rod, as in our case) instead of a system of
point-like masses, we are dealing with a continuous
distribution of mass and the moment
of inertia is approximately equal to the moment
of inertia of a system of point masses obtained
through partitioning the body into
many little parts and replacing each part by
a point mass equal to the mass of the part.
It is clear that this definition, where it is assumed
that the number of these parts tends to
infinity while the size of each part tends to
zero, leads to a formula for the moment of
inertia in the form of an integral (cf. formula
(9.12.9); see footnote 9.20). A sum similar to (9.12.10)
but with distances $d_i$ between points $M_i$ and
the straight line $l$ substituted for the squares
of such distances, $d_i^2$, is known as the static
moment of a system of masses about $l$:

$$\sum_i m_id_i = m_1d_1 + m_2d_2 + \ldots + m_kd_k. \qquad (9.12.10a)$$

Here, however, we must allow for the sign of
$d_i$, depending on what side of $l$ the particular
point $M_i$ lies. It is clear that the transition
from a system of point-like masses to a continuous
body leads in this case, too, to an integral
replacing the sum; for instance, the static
moment $\int_a^b x\sigma(x)\,dx$ of a rod about point $O$
is present in formula (9.12.6) for the center of
mass of a rod.

We now take up the evaluation of $I$.
For a rod whose center of gravity lies
at the origin $O$ (the center of rotation),
the moment of inertia assumes the
value $I_0$:

$$I_0 = \int_{a_0}^{b_0} x^2\sigma_0(x)\,dx. \qquad (9.12.11)$$

9.20 In the case of a flat plate or a solid
we have to deal with a multiple integral (double
or triple) over the area of the plate or the
volume of the solid. In this book we will not
consider multiple integrals, however.

Let us determine the moment of inertia
of a rod for the case where the center
of gravity $C_1$ is distant $l$ rightward
from the origin, so that $x_{C_1} = l$.
In this case $a = a_0 + l$, $b = b_0 + l$,
$\sigma(x) = \sigma_0(x - l)$, and
$I = \int_a^b x^2\sigma(x)\,dx$. If we set $z = x - l$,
then $x = z + l$ and $dx = dz$. When $x$
varies from $a$ to $b$, the quantity $z$
varies from $a_0$ to $b_0$. Therefore,

$$I = \int_{a_0}^{b_0} (z + l)^2\sigma_0(z)\,dz = l^2\int_{a_0}^{b_0} \sigma_0(z)\,dz + 2l\int_{a_0}^{b_0} \sigma_0(z)\,z\,dz + \int_{a_0}^{b_0} \sigma_0(z)\,z^2\,dz. \qquad (9.12.12)$$

Note that $\int_{a_0}^{b_0} \sigma_0(z)\,dz = m$, while the
second integral on the right-hand side
of (9.12.12) is zero by formula (9.12.8).
Finally, the third integral in (9.12.12)
is $I_0$ by (9.12.11).

Thus, formula (9.12.12) assumes the
form

$$I = ml^2 + I_0. \qquad (9.12.13)$$
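Formula (9.12.13) can be verified numerically for a nonuniform rod. The sketch below is an illustration under assumptions made here: the density $\sigma(x) = 1 + x$ on the segment from 0 to 2, the midpoint rule for the integrals, and the variable names are not taken from the text.

```python
def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def sigma(x):
    return 1.0 + x                     # hypothetical linear density

a, b = 0.0, 2.0
m = integrate(sigma, a, b)                                   # mass, formula (9.12.2)
l_cg = integrate(lambda x: x * sigma(x), a, b) / m           # distance of the center of gravity from the origin
I = integrate(lambda x: x**2 * sigma(x), a, b)               # moment of inertia about the origin, (9.12.9)
I0 = integrate(lambda x: (x - l_cg) ** 2 * sigma(x), a, b)   # moment of inertia about the center of gravity

print(f"I = {I:.6f},  m*l^2 + I0 = {m * l_cg**2 + I0:.6f}")
```

The two numbers coincide to the accuracy of the integration, as (9.12.13) requires.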

The quantity $ml^2$ is clearly the moment
of inertia of a point mass $m$ distant $l$
from the axis of rotation (from the origin).
Thus, the moment of inertia of a
rod about an arbitrary axis perpendicular
to the rod is equal to the sum of
two terms: the moment of inertia of the
rod about the parallel axis passing
through the center of gravity, and the
moment of inertia, about the arbitrary
axis, of a point mass equal to the mass
of the rod and placed at the rod’s center
of gravity.



We can picture a rod hinged at the
center of gravity. Rotation of the axis
and the hinge need not be accompanied
by rotation of the rod, that is, we can
visualize a motion with successive
stages as indicated in Figure 9.12.7. The
kinetic energy $E'$ of such motion is
$mv_{C_1}^2/2$, with $v_{C_1}$ the velocity of the
center of gravity of the rod. But
$v_{C_1} = \omega l$, so that $E' = (\omega^2/2)\,ml^2$.

The motion we considered earlier
(see Figure 9.12.6) differs from that of
Figure 9.12.7 in that in the former case
the rod itself was in rotation with an
angular velocity $\omega$ about its center of
gravity. For this reason, the kinetic
energy of rotation in Figure 9.12.6
proves to be equal to the sum of the
energy of rotation of the type in Figure
9.12.7 and of the energy of rotation
about the center of gravity, which is
equal to $I_0\omega^2/2$.

It is evident from the derivation of 
the formula that such a simple addition 
of energies in the combination of two 
motions only results when we consider 
the motion of the center of gravity; 
only then do we find the integral 
(9.12.8) to be equal to zero. 

We can approach the concept of the center
of gravity of a rod from another angle. From
elementary physics it is well known that the
resultant of two parallel forces $f_1$ and $f_2$
(pointing in the same direction) applied at
points $M_1$ and $M_2$, respectively, is a force
$f_1 + f_2$ applied at the point $C$ that divides the
segment $M_1M_2$ in the ratio $M_1C : CM_2 = f_2 : f_1$
(Figure 9.12.8a). If $f_1$ and $f_2$ are forces of
gravity acting on masses $m_1$ and $m_2$, then point
$C$ (such that $M_1C : CM_2 = f_2 : f_1 = m_2 : m_1$)
is called the center of gravity of the two
masses; if, in addition, we take the straight
line $M_1M_2$ as the abscissa axis and assume
the abscissas of points $M_1$ and $M_2$ to be $x_1$
and $x_2$, then the abscissa $x_C$ of point $C$ is
$(m_1x_1 + m_2x_2)/(m_1 + m_2)$. Indeed, then

$$M_1C : CM_2 = \left(x_1 - \frac{m_1x_1 + m_2x_2}{m_1 + m_2}\right) : \left(\frac{m_1x_1 + m_2x_2}{m_1 + m_2} - x_2\right) = \frac{m_2(x_1 - x_2)}{m_1 + m_2} : \frac{m_1(x_1 - x_2)}{m_1 + m_2} = m_2 : m_1.$$


Similarly, if we have three masses $m_1$, $m_2$,
and $m_3$ lying on a single straight line at points
$M_1(x_1)$, $M_2(x_2)$, and $M_3(x_3)$ (Figure 9.12.8b),
the resultant of the forces of gravity applied
to the first two masses is the force applied to a
mass $m_1 + m_2$ lying at point
$C_{12}\left(\frac{m_1x_1 + m_2x_2}{m_1 + m_2}\right)$.
The resultant of all three forces will be
applied to the center of gravity of the masses
$m_1 + m_2$ and $m_3$, that is, to the point $C$
with the abscissa

$$x_C = \left[(m_1 + m_2)\,\frac{m_1x_1 + m_2x_2}{m_1 + m_2} + m_3x_3\right] \Big/ \left[(m_1 + m_2) + m_3\right] = \frac{m_1x_1 + m_2x_2 + m_3x_3}{m_1 + m_2 + m_3}.$$


Quite similarly we can prove that the resultant
of the forces of gravity of a system of
masses $m_1, m_2, \ldots, m_k$ lying at points $M_1(x_1)$,
$M_2(x_2), \ldots, M_k(x_k)$ is applied at the point

$$C\left(\frac{m_1x_1 + m_2x_2 + \ldots + m_kx_k}{m_1 + m_2 + \ldots + m_k}\right) \qquad (9.12.14)$$

(see Exercise 9.12.4a).

Reasoning along the same lines, we can establish that if $k$ masses
$m_1, m_2, \ldots, m_k$ lie in space at points $M_1(x_1, y_1, z_1)$,
$M_2(x_2, y_2, z_2)$, $\ldots$, $M_k(x_k, y_k, z_k)$, the resultant of
all the forces of gravity is a force applied to the mass
$m_1 + m_2 + \ldots + m_k$ at the center of gravity of the system,
a point $C(x_C, y_C, z_C)$ with coordinates

$$x_C = \frac{\sum_i m_ix_i}{\sum_i m_i}, \quad
y_C = \frac{\sum_i m_iy_i}{\sum_i m_i}, \quad
z_C = \frac{\sum_i m_iz_i}{\sum_i m_i} \qquad (9.12.15)$$

(see Exercise 9.12.4b).
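Formulas (9.12.14) and (9.12.15) are mass-weighted averages of the coordinates, which a few lines of Python make concrete. The following is only a minimal sketch and not part of the original text: the function name `center_of_gravity` and the sample masses are illustrative assumptions. It also checks that merging the first two masses and then adding the third, as in the three-mass argument, gives the same point as the direct formula.

```python
def center_of_gravity(masses, points):
    """Center of gravity of point masses (hypothetical helper).

    masses: sequence of positive numbers m_i
    points: sequence of coordinate tuples, e.g. (x_i,) or (x_i, y_i, z_i)
    """
    total = sum(masses)
    dim = len(points[0])
    return tuple(
        sum(m * p[axis] for m, p in zip(masses, points)) / total
        for axis in range(dim)
    )

# Three masses on a line: direct use of the one-dimensional formula
c_line = center_of_gravity([1.0, 2.0, 3.0], [(0.0,), (1.0,), (4.0,)])

# Same masses merged step by step: first m1 and m2, then the pair with m3
c12 = center_of_gravity([1.0, 2.0], [(0.0,), (1.0,)])
c_step = center_of_gravity([1.0 + 2.0, 3.0], [c12, (4.0,)])

# Two equal masses in space: the center lies at the midpoint
c_space = center_of_gravity([1.0, 1.0], [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
```

Both routes give the abscissa 14/6, which is the content of the step-by-step argument for three masses.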

Now let us take a rod with a linear density $\sigma = \sigma(x)$. If we
partition the rod by points $x_0 = a, x_1, x_2, \ldots, x_n = b$ into
$n$ small parts with masses $m_i = \sigma(x_i)\,\Delta x_i$, where
$\Delta x_i = x_i - x_{i-1}$ and $i = 1, 2, \ldots, n$, and then
consider $n$ point-like masses $m_i$ corresponding to the $n$ parts of
the rod (each part is assumed homogeneous since within each
$\Delta x_i$ the density changes little) and concentrated at the
endpoints of each part, $x = x_i$, the resultant force of gravity for
these point masses will be applied to the center of gravity
$C_1(x_{C_1})$, where

$$x_{C_1} = \frac{\sum_i x_i\,\sigma(x_i)\,\Delta x_i}{\sum_i \sigma(x_i)\,\Delta x_i}.$$

Sending $n$ to infinity and all the $\Delta x_i$ to zero and replacing
the sums with integrals, we arrive at formula (9.12.6) for the center
of gravity of the rod.
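The passage from the sum to the integral can be watched numerically. The sketch below is an illustration only: the helper name `rod_center_sum`, the equal spacing of the partition, and the test density $\sigma(x) = x$ are assumptions made here, not taken from the text. For that density on $[0, 1]$ the limiting value is $\int x \cdot x\,dx / \int x\,dx = (1/3)/(1/2) = 2/3$.

```python
def rod_center_sum(sigma, a, b, n):
    """Center of gravity of n point masses replacing a rod on [a, b]."""
    dx = (b - a) / n                  # equal parts: every Delta x_i equals dx
    moment = mass = 0.0
    for i in range(1, n + 1):
        x_i = a + i * dx              # right endpoint of the i-th part
        m_i = sigma(x_i) * dx         # m_i = sigma(x_i) * Delta x_i
        moment += x_i * m_i
        mass += m_i
    return moment / mass

# Density growing linearly along the rod: sigma(x) = x on [0, 1]
coarse = rod_center_sum(lambda x: x, 0.0, 1.0, 10)
fine = rod_center_sum(lambda x: x, 0.0, 1.0, 1000)
```

With 10 parts the sum gives 0.7; with 1000 parts it differs from 2/3 by about 1/3000, so the discrepancy shrinks in proportion to the size of the parts.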


Exercises 

9.12.1. Find the moment of inertia about 
the center of gravity of a rod of length l with 
a uniform distribution of mass. 

9.12.2. A rod is made up of two pieces:
one piece of length $l_1$ has a constant density $\sigma_1$,
and the other one of length $l_2$ has a constant but
different density $\sigma_2$. Find the position of the
center of gravity of the rod.

9.12.3. Find the position of the center of
gravity and the magnitude of the moment of
inertia about the center of gravity of a rod
in the form of a thin triangle of length $L$.
Express these quantities in terms of the length
$L$ and the mass $m$ of the rod. [Hint. If the $x$
axis lies along a median and the origin is chosen
at the respective vertex of the triangle,
then $\sigma(x) = ax$, with $a$ a constant.]

9.12.4. Prove (a) formula (9.12.14) and 
(b) formula (9.12.15). 


9.13* Centers of Gravity of a String 
and of a Plate 

In Section 7.11 we encountered the notions of the center of gravity of
a flat plate and of a string that was bent in an arbitrary manner.
Suppose $AB$, with $A = A(a_1, b_1)$ and $B = B(a_2, b_2)$, is a string
of arbitrary shape of linear density $\sigma(x, y)$ (Figure 9.13.1).
In other words, we assume that a small section of the string of length
$ds = \sqrt{(x')^2 + (y')^2}\,dt$, with $x = x(t)$ and $y = y(t)$ the
parametric equations of $AB$, has a mass $dm = \sigma(x, y)\,ds$, where
$(x, y)$ is a point within the considered section, say, its end. If we
replace the continuous distribution of mass, or our string, with a
system of $n$ point-like masses $dm_i = \sigma(x_i, y_i)\,ds_i$
($i = 1, 2, \ldots, n$), each corresponding to the section $ds_i$ of
the string with endpoints $A_{i-1}(x_{i-1}, y_{i-1})$ and
$A_i(x_i, y_i)$, where $A_0 = A, A_1, A_2, \ldots, A_n = B$ are the
points that partition the string into $n$ small parts, we conclude that
the center of gravity $C_1$ of these point-like masses has the
following coordinates:

$$x_{C_1} = \frac{\sum_i x_i\,\sigma(x_i, y_i)\,ds_i}{\sum_i \sigma(x_i, y_i)\,ds_i}, \quad
y_{C_1} = \frac{\sum_i y_i\,\sigma(x_i, y_i)\,ds_i}{\sum_i \sigma(x_i, y_i)\,ds_i} \qquad (9.13.1)$$

(cf. (9.12.14) and (9.12.15)). Going over to the limit $n \to \infty$,
$ds_i \to 0$, we arrive at the following expressions for the
coordinates $x_C$ and $y_C$ of the center of gravity $C$ of the string:

$$x_C = \frac{\int_A^B x\,\sigma(x, y)\,ds}{\int_A^B \sigma(x, y)\,ds}, \quad
y_C = \frac{\int_A^B y\,\sigma(x, y)\,ds}{\int_A^B \sigma(x, y)\,ds}; \qquad (9.13.2)$$

here $\int_A^B$ stands for integration along the curve $AB$. If
$x = x(t)$ and $y = y(t)$ are the parametric equations of $AB$, and
$t_1$ and $t_2$ correspond to the endpoints $A$ and $B$, then along
this string we have $\sigma = \sigma(x(t), y(t)) = \sigma(t)$, the
total mass of the string is

$$m = \int_{t_1}^{t_2} \sigma(t)\sqrt{(x')^2 + (y')^2}\,dt,$$

and the coordinates of the center of gravity are

$$x_C = \frac{1}{m}\int_{t_1}^{t_2} x(t)\,\sigma(t)\sqrt{(x')^2 + (y')^2}\,dt, \quad
y_C = \frac{1}{m}\int_{t_1}^{t_2} y(t)\,\sigma(t)\sqrt{(x')^2 + (y')^2}\,dt. \qquad (9.13.3a)$$

If the form of the string $AB$ is given explicitly,$^{9.21}$
$y = y(x)$, with $a \le x \le b$, then
$\sigma = \sigma(x, y(x)) = \sigma(x)$, the mass is

$$m = \int_a^b \sigma(x)\sqrt{1 + (y')^2}\,dx,$$

and

$$x_C = \frac{1}{m}\int_a^b x\,\sigma(x)\sqrt{1 + (y')^2}\,dx, \quad
y_C = \frac{1}{m}\int_a^b y(x)\,\sigma(x)\sqrt{1 + (y')^2}\,dx. \qquad (9.13.3b)$$

In the particular case of a uniform string of constant density,
$\sigma$ = constant, we find that $m = \sigma S$, where
$S = \int_A^B ds$ is the length of the string, whereby formulas
(9.13.2), (9.13.3a), and (9.13.3b) can be rewritten thus:

$$x_C = \frac{1}{S}\int_A^B x\,ds, \quad y_C = \frac{1}{S}\int_A^B y\,ds, \qquad (9.13.4)$$

or

$$x_C = \frac{1}{S}\int_{t_1}^{t_2} x(t)\sqrt{(x')^2 + (y')^2}\,dt, \quad
y_C = \frac{1}{S}\int_{t_1}^{t_2} y(t)\sqrt{(x')^2 + (y')^2}\,dt, \qquad (9.13.5a)$$

and

$$x_C = \frac{1}{S}\int_a^b x\sqrt{1 + (y')^2}\,dx, \quad
y_C = \frac{1}{S}\int_a^b y(x)\sqrt{1 + (y')^2}\,dx. \qquad (9.13.5b)$$

9.21 Here we assume that each value of $x$
corresponds to a single point of the string; if
this is not so, we can simply break up the
string into several sections in each of which
the above condition is satisfied.

Note that all formulas for the centers of 
gravity contain the static moments of 
the string about axes Ox and Oy (see 
p. 340). 
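The parametric formulas for a string lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions: the function name `string_center`, the chord approximation of each small section, and the semicircle test curve are choices made here, not taken from the text. For a uniform semicircular string of radius 1 the symmetry gives $x_C = 0$, and the known value of the ordinate is $y_C = 2/\pi$.

```python
import math

def string_center(x, y, t1, t2, sigma=lambda px, py: 1.0, n=2000):
    """Approximate (x_C, y_C) of the string x = x(t), y = y(t), t1 <= t <= t2.

    Each small section is replaced by its chord; its mass sigma * ds is
    placed at the chord's midpoint (an assumption of this sketch).
    """
    dt = (t2 - t1) / n
    mass = moment_x = moment_y = 0.0
    for i in range(n):
        ta, tb = t1 + i * dt, t1 + (i + 1) * dt
        xa, ya, xb, yb = x(ta), y(ta), x(tb), y(tb)
        ds = math.hypot(xb - xa, yb - ya)        # length of the section
        xm, ym = 0.5 * (xa + xb), 0.5 * (ya + yb)
        dm = sigma(xm, ym) * ds                  # dm = sigma(x, y) ds
        mass += dm
        moment_x += xm * dm
        moment_y += ym * dm
    return moment_x / mass, moment_y / mass

# Uniform semicircle of radius 1: x = cos t, y = sin t, 0 <= t <= pi
xc, yc = string_center(math.cos, math.sin, 0.0, math.pi)
```

The sums `moment_x` and `moment_y` are discrete versions of the static moments that appear in the numerators of the formulas.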

The situation is somewhat more com- 
plicated when we deal with a plate 
D (Figure 9.13.2; the thickness of the 
plate can be assumed so small that it 
can be ignored). Actually, the problem 
of finding the center of gravity of a 
plate is reduced to evaluating certain 
double integrals over the area of the 
plate, which integrals are not discussed 
(or even formulated) in this book. How- 
ever, there is a certain method that 
makes it possible to overcome this diffi- 
culty. 



Figure 9.13.2 


We denote by $\rho = \rho(x, y)$ the density of the plate per unit
surface area; in other words, we will assume that the mass of a small
rectangle with sides $dx$ and $dy$, with $dx$ and $dy$ very small (see
Figure 9.13.2; the sides of the rectangle are taken parallel to the
coordinate axes), is equal to $\rho(x_0, y_0)\,dx\,dy$, where $dx\,dy$
is the area of the rectangle, obviously, and $\rho(x_0, y_0)$ is the
density at a point $(x_0, y_0)$ of the rectangle, say, at one of its
vertices (since the rectangle is small, the density changes but little
within its limits). Next we “cut” the plate into $n$ vertical strips by
the straight lines $x = x_0 = a$, $x = x_1$, $x = x_2$, $\ldots$,
$x = x_n = b$, the width of each strip being
$\Delta x_i = x_i - x_{i-1}$ ($i = 1, 2, \ldots, n$); here $a$ and $b$
are the smallest and greatest of the abscissas within plate $D$
(Figure 9.13.2). We may also assume that the $i$th strip (or the $i$th
rod) has a linear density that depends on $y$:

$$\sigma_i(y) = \rho(x_i, y)\,\Delta x_i, \qquad (9.13.6)$$

and a mass

$$\Delta m_i = \int_{g(x_i)}^{f(x_i)} [\rho(x_i, y)\,\Delta x_i]\,dy
= \left(\int_{g(x_i)}^{f(x_i)} \rho(x_i, y)\,dy\right)\Delta x_i, \qquad (9.13.7)$$

where $y = g(x_i)$ and $y = f(x_i)$ are the smallest and greatest
values of the ordinates $y$ of the points of the $i$th rod; precisely,
$(x_i, g(x_i))$ and $(x_i, f(x_i))$ are points of intersection of the
straight line $x = x_i$ with the boundary$^{9.22}$ of plate $D$. (Note
that the integrand in the term on the right-hand side of (9.13.7) is a
function of a single variable $y$, since the value of $x_i$ is fixed.)
This mass $\Delta m_i$ distributed “over” the rod can be replaced with
a point-like mass $\Delta m_i$ concentrated at the center of gravity of
the rod at point $C_i(x_{C_i}, y_{C_i})$ with coordinates

$$x_{C_i} = x_i, \quad
y_{C_i} = \frac{\int_{g(x_i)}^{f(x_i)} y\,\rho(x_i, y)\,dy}{\int_{g(x_i)}^{f(x_i)} \rho(x_i, y)\,dy} \qquad (9.13.8)$$

(cf. (9.13.6) and (9.12.6));$^{9.23}$ note that the factor
$\Delta x_i$ in the density (9.13.6) cancels out in the fraction in the
expression for the ordinate $y_{C_i}$ of the center of gravity $C_i$.

9.22 For the sake of simplicity we assume
that plate $D$ has the “oval” shape as depicted
in Figure 9.13.2, that is, is bounded by two
simple curves $y = g(x)$ and $y = f(x)$ taken
within the limits $x = a$ and $x = b$.

Thus, we have replaced the force of gravity acting on the plate with
$n$ forces acting on the masses $\Delta m_1, \Delta m_2, \ldots,
\Delta m_n$ (cf. (9.13.7)) at points $C_1, C_2, \ldots, C_n$ (see
(9.13.8)). What remains to be done is to find the resultant of these
forces applied to the center of gravity of the system of rods
considered here and then (as usual) to pass from sums to integrals via
the limiting process $n \to \infty$, $\Delta x_i \to 0$ (for all $i$'s
from 1 to $n$).

The derivation of the corresponding 
formulas poses no difficulties (see Exer- 
cise 9.13.2), but still it lies outside the 
scope of this book, since it involves 
evaluating iterated integrals, that is, 
integrals that contain functions express- 
ed in terms of other integrals. For this 
reason we will not clutter our exposi- 
tion with such formulas. Instead, we con- 
sider the case of a homogeneous plate 
with constant density p = constant, 
which of course can be set at unity: 
$\rho(x, y) = 1$. Then a rod of width $\Delta x_i$ possesses a mass

$$\Delta m_i = \int_{g(x_i)}^{f(x_i)} dy\,\Delta x_i = [f(x_i) - g(x_i)]\,\Delta x_i, \qquad (9.13.9)$$

which coincides with the area of the rod (the product of its height
$f(x_i) - g(x_i)$ by the width $\Delta x_i$), and the center of
gravity $C_i$ lies, naturally, in the midpoint:

$$x_{C_i} = x_i, \quad y_{C_i} = \frac{f(x_i) + g(x_i)}{2}. \qquad (9.13.10)$$

9.23 Formula (9.12.6) refers to the case of
a horizontal rod, but the equal status of the $x$
and $y$ axes already follows from the definition
of the center of gravity of a rod as a point $C$
such that when the rod is fixed at this point,
it is in equilibrium (see p. 339): one axis is
transformed into the other if we rotate the rod
through 90° about point $C$. Hence the second
formula in (9.13.8) follows directly from
(9.12.6).


The sum of all masses (9.13.9) is

$$m = \sum_i \Delta m_i = \sum_i [f(x_i) - g(x_i)]\,\Delta x_i.$$

Going over to the limit $n \to \infty$, $\Delta x_i \to 0$, we get

$$m = \int_a^b [f(x) - g(x)]\,dx,$$

which is obviously the area $S$ of plate $D$ (cf. Section 7.5). The
coordinates of the center of gravity $C'$ of the strips (rods) with the
masses (9.13.9) concentrated at points (9.13.10) are

$$x_{C'} = \sum_i x_i\,[f(x_i) - g(x_i)]\,\Delta x_i \Big/ S, \quad
y_{C'} = \sum_i \frac{f(x_i) + g(x_i)}{2}\,[f(x_i) - g(x_i)]\,\Delta x_i \Big/ S$$

(cf. (9.12.14) and (9.12.15)), or

$$x_{C'} = \sum_i x_i\,[f(x_i) - g(x_i)]\,\Delta x_i \Big/ S, \quad
y_{C'} = \frac{1}{2}\sum_i \left\{[f(x_i)]^2 - [g(x_i)]^2\right\}\Delta x_i \Big/ S.$$

The same limiting process $n \to \infty$, $\Delta x_i \to 0$ leads to
the following (exact) formulas for the coordinates $x_C$ and $y_C$ of
the center of gravity $C$ of the homogeneous plate $D$ with surface
area $S$:

$$x_C = \frac{1}{S}\int_a^b x\,[f(x) - g(x)]\,dx, \quad
y_C = \frac{1}{2S}\int_a^b \left\{[f(x)]^2 - [g(x)]^2\right\}dx, \qquad (9.13.11)$$

which we will now write out in full for the special case of a
curvilinear trapezoid $D = ABCD$, which is formed by the $x$ axis, the
straight lines $x = a$ and $x = b$, and the curve $y = f(x)$, that is,
for the case with $g(x) = 0$:

$$x_C = \frac{1}{S}\int_a^b x f(x)\,dx, \quad
y_C = \frac{1}{2S}\int_a^b [f(x)]^2\,dx, \qquad (9.13.12)$$

where, as we already know, $S = \int_a^b f(x)\,dx$.
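The strip construction translates almost literally into code. The following sketch is an illustration, not the book's method of evaluation: the function name `plate_center`, the midpoint placement of each strip, and the test region under $y = x^2$ on $[0, 1]$ are assumptions made here. For that region $S = 1/3$, $x_C = 3\int_0^1 x^3\,dx = 3/4$, and $y_C = \frac{3}{2}\int_0^1 x^4\,dx = 3/10$.

```python
def plate_center(f, g, a, b, n=10000):
    """Center of gravity of a homogeneous plate g(x) <= y <= f(x), a <= x <= b.

    The plate is cut into n vertical strips; each strip is replaced by a
    point mass equal to its area, placed at mid-height of the strip.
    """
    dx = (b - a) / n
    area = moment_x = moment_y = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx               # abscissa of the i-th strip
        top, bottom = f(x), g(x)
        dm = (top - bottom) * dx             # strip mass for density 1
        area += dm
        moment_x += x * dm
        moment_y += 0.5 * (top + bottom) * dm
    return moment_x / area, moment_y / area, area

# Curvilinear trapezoid under the parabola y = x**2 on [0, 1], with g(x) = 0
xc, yc, S = plate_center(lambda x: x * x, lambda x: 0.0, 0.0, 1.0)
```

The product `0.5 * (top + bottom) * (top - bottom)` is exactly the half-difference of squares that appears under the integral sign in the formula for the ordinate.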


Exercises 

9.13.1. Find the center of gravity of a ho- 
mogeneous plate that is (a) a triangle, (b) a 
trapezoid, and (c) a half-disk. 

9.13.2. How can we find the coordinates
$x_C$ and $y_C$ of the center of gravity $C$ of a plate
$D$ of density $\rho = \rho(x, y)$ bounded by the curves
$y = g(x)$ and $y = f(x)$ (with $a \le x \le b$)?

9.14 The Motion of a Body in a 
Medium that Resists this Motion with 
a Force Dependent Solely on 
the Velocity 

When in motion, every body experi- 
ences a counteraction from the medium 
in which the motion is taking place. 
If the resistance is slight, say, in the 
motion in air with a low velocity, it 
can often be neglected. However, in 
some cases this approach is not satisfac- 
tory and the resistance has to be ta- 
ken into account. 


It has been established in experiments that if a body is moving in a
liquid or a gas and the speed is low and the body is small, the force
of resistance is proportional to the speed:

$$F(t) = -kv(t). \qquad (9.14.1)$$

Here the coefficient of proportionality, $k$, is positive, and the
minus sign shows that the force of resistance is in opposition to the
velocity of the body. The number $k$ depends on the properties of the
medium and is proportional to the viscosity of the medium.$^{9.24}$ In
addition, $k$ is dependent on the shape and dimensions of the body. For
example, if the body is a ball of radius $R$, formula (9.14.1) assumes
the form of the Stokes law:$^{9.25}$

$$F = -6\pi R\eta\,v(t), \qquad (9.14.2)$$

where $\eta$ is the viscosity of the medium. For air,
$\eta = 1.8 \times 10^{-4}$ g/cm·s, while for water at 20 °C,
$\eta = 0.01$ g/cm·s.
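To get a feeling for the magnitudes involved, the Stokes law can be evaluated for a small droplet in air. This is a hedged illustration: the droplet radius, its speed, and the air density $1.2 \times 10^{-3}$ g/cm³ are assumptions introduced here (the text quotes only the viscosity), and the function names are hypothetical. The dimensionless combination $vR\rho/\eta$ is computed as well, since the footnote on the Stokes law restricts the formula to values of it below 5.

```python
import math

ETA_AIR = 1.8e-4    # g/(cm*s), viscosity of air quoted in the text
RHO_AIR = 1.2e-3    # g/cm^3, ASSUMED density of air (not given in the text)

def stokes_force(radius, speed, eta):
    """Magnitude of the Stokes drag 6*pi*R*eta*v in dynes (CGS units)."""
    return 6.0 * math.pi * radius * eta * speed

def reynolds_number(radius, speed, rho, eta):
    """Dimensionless combination v*R*rho/eta."""
    return speed * radius * rho / eta

R, v = 1.0e-3, 1.0   # assumed droplet: radius 0.001 cm, speed 1 cm/s
n_re = reynolds_number(R, v, RHO_AIR, ETA_AIR)
force = stokes_force(R, v, ETA_AIR)
```

For these assumed values the Reynolds number is well below 5, so the linear law applies, and the drag comes out at a few millionths of a dyne.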

We consider the problem of decelera- 
tion of a body. Suppose that some force 
has imparted a velocity to a body 
and at time t = t 0 has ceased to act. 
The body continues to move and is act- 
ed upon by the force of resistance 
alone. 


9.24 Viscosity $\eta$ can be defined as follows.
Let a liquid (or gas) be in motion along the $x$
axis, and let the velocities of the various particles
be distinct and dependent on the $y$ coordinate.
It is clear that a solid could not move
in that fashion and would be destroyed. In
a liquid or a gas, in this case there arises,
between adjacent layers, a force of friction
proportional to the difference in the velocities
of the adjacent layers, that is, to the derivative
$dv/dy$. The proportionality constant in the
expression for force $f$ per square centimeter
of horizontal surface area is called the viscosity:
$f = \eta\,(dv/dy)$ (here the force $f$ is expressed in
N/cm² rather than in N, which implies that
the dimensions of viscosity are g/cm·s).

9.25 Sir George Gabriel Stokes (1819-1903)
was a British mathematician and physicist.
Formula (9.14.2) is valid for $vR\rho/\eta < 5$,
where $\rho$ is the density of the medium. The
reader can easily see that the quantity $vR\rho/\eta$
is dimensionless. It is called the Reynolds
number (abbreviated: $N_{\mathrm{Re}}$) after the British
physicist and engineer Osborne Reynolds
(1842-1912).

By Newton’s second law,

$$m\,\frac{dv}{dt} = -kv.$$

Dividing both sides by $m$ and setting $k/m = \alpha$ ($\alpha > 0$),
we get

$$\frac{dv}{dt} = -\alpha v. \qquad (9.14.3)$$

The solution to this equation, as we already know (e.g. see Chapters 4
and 8), is

$$v(t) = v_0 \exp(-\alpha(t - t_0)), \qquad (9.14.4)$$

where $v_0$ ($= v(t_0)$) is the value of the velocity at time
$t = t_0$. Since $\alpha$ is positive, it follows that for $t > t_0$
the exponent in (9.14.4) is negative, $\exp(-\alpha(t - t_0)) < 1$,
and, hence, $v(t) < v_0$, that is, the velocity falls off with the
passage of time: the medium retards the motion of the body, as
expected.

Let us find an expression for the distance traveled by the body. From
(9.14.4) it follows that

$$\frac{dx}{dt} = v_0 \exp(-\alpha(t - t_0)), \quad\text{or}\quad
dx = v_0 \exp(-\alpha(t - t_0))\,dt. \qquad (9.14.5)$$

Suppose that at $t = t_0$ (initial time) the body is at the origin:
$x(t_0) = 0$. Integrating (9.14.5) we find that

$$x(t) = v_0 \int_{t_0}^{t} e^{-\alpha(t - t_0)}\,dt,$$

that is,

$$x(t) = \frac{v_0}{\alpha}\left[1 - e^{-\alpha(t - t_0)}\right]. \qquad (9.14.6)$$

Using formula (9.14.6), we can find the entire distance covered by the
body after time $t_0$, that is, after the force ceases to act, up to
the moment when the body stops. To this end, we note that for very
large values of $t$, the quantity $\exp(-\alpha(t - t_0))$ is
extremely small ($\exp(-\alpha(t - t_0)) \ll 1$) and can be neglected.
Therefore, the body travels a total distance of $v_0/\alpha$.
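The two closed-form results, $v(t) = v_0 e^{-\alpha(t - t_0)}$ and $x(t) = (v_0/\alpha)[1 - e^{-\alpha(t - t_0)}]$, can be cross-checked numerically. The sketch below is illustrative only; the function names and the values $v_0 = 3$ cm/s, $\alpha = 0.5$ s⁻¹ are assumptions made here. It confirms that the slope of $x(t)$ reproduces $v(t)$ and that $x(t)$ levels off at $v_0/\alpha$.

```python
import math

def velocity(t, v0, alpha, t0=0.0):
    """v(t) = v0 * exp(-alpha * (t - t0)), formula (9.14.4)."""
    return v0 * math.exp(-alpha * (t - t0))

def distance(t, v0, alpha, t0=0.0):
    """x(t) = (v0 / alpha) * (1 - exp(-alpha * (t - t0))), formula (9.14.6)."""
    return v0 / alpha * (1.0 - math.exp(-alpha * (t - t0)))

v0, alpha = 3.0, 0.5              # assumed values: cm/s and 1/s
stopping_distance = v0 / alpha    # limit of x(t) for very large t

# Central difference at t = 2: dx/dt should reproduce v(2)
h = 1e-5
slope = (distance(2.0 + h, v0, alpha) - distance(2.0 - h, v0, alpha)) / (2.0 * h)
```

Strictly speaking the body never stops in this model; the velocity only becomes negligibly small, which is why the total distance appears as a limit.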

Let us now examine the fall of a body in air (allowing for air drag).
We direct the $x$ axis downward toward the ground, and place the origin
at a height $H$ above the ground (i.e. $x = H$ on the ground). Let the
motion begin at $t = 0$ with a velocity $v_0$. Then $x(0) = 0$ and
$v(0) = v_0$. The body is acted upon by two forces: the force of
gravity (assists the downward motion) and the force of air resistance
(inhibits the motion).

By Newton’s second law,

$$m\,\frac{dv}{dt} = mg - kv. \qquad (9.14.7)$$

Dividing all terms of this equation by $m$, we get
$dv/dt = g - \alpha v$ (since $k/m = \alpha$), or

$$\frac{dv}{dt} = \alpha\left(\frac{g}{\alpha} - v\right). \qquad (9.14.8)$$

We establish the dimensions of $g/\alpha$. Since $\alpha = k/m$ and
$k = -F/v$, it follows that $\alpha$ has the dimensions of s$^{-1}$.
The dimensions of $g/\alpha$ are then cm·s/s² = cm/s, which means that
$g/\alpha$ has the dimensions of velocity.$^{9.26}$

Set $g/\alpha = v_1$. Equation (9.14.8) takes the form

$$\frac{dv}{dt} = \alpha(v_1 - v). \qquad (9.14.9)$$

Suppose that $v_0$ is less than $v_1$. Then the right member of
(9.14.9) is positive at the start of the motion and, hence, the left
member is positive too, $dv/dt > 0$; therefore, the velocity $v(t)$
increases. And the closer $v$ is to $v_1$, the closer $dv/dt$ is to
zero and, consequently, the slower $v$ grows. If at some time $t_1$ it
is true that $v(t_1) = v_1$, after this $v$ remains constant, since
$v = v_1$ is the solution to Eq. (9.14.9) with initial condition
$v(t_1) = v_1$. Similarly, if at the start of motion $v_0 > v_1$,
velocity $v$ approaches $v_1$, too, but in this case $v$ decreases. For
this reason, a certain time after the start of motion the body begins
to fall at practically a constant velocity of $v_1 = g/\alpha$,
irrespective of the velocity it had at the start of fall. The graph of
velocity for $v_0 = 0$ is shown in Figure 9.14.1, where
$v = g/\alpha$ is the horizontal asymptote to this graph.

Figure 9.14.1

9.26 The calculation performed here of the
dimensions of $g/\alpha$ is a check. The dimensions of
$g/\alpha$ are evident from formula (9.14.8). Since
only quantities having the same dimensions
can be subtracted, it follows that $g/\alpha$ must
have the dimensions of velocity, $v$.

The foregoing examination shows that a number of properties of $v(t)$
can be detected even without solving Eq. (9.14.9). Now let us solve
this equation. Put $v_1 - v = z$. Then $dz/dt = -dv/dt$, and Eq.
(9.14.9) can be rewritten as $dz/dt = -\alpha z$. At $t = 0$ it must be
true that $z = v_1 - v_0$. The desired solution is
$z(t) = (v_1 - v_0)e^{-\alpha t}$. Passing to the function $v(t)$, we
get

$$v_1 - v(t) = (v_1 - v_0)e^{-\alpha t}, \quad\text{or}\quad
v(t) = v_1 + (v_0 - v_1)e^{-\alpha t}. \qquad (9.14.10)$$

Considering this equation, it is easy to see that we can draw the same
conclusions as were evident from our rough analysis of Eq. (9.14.9).
First, if $v_0 > v_1$ then $v(t) > v_1$, since
$(v_0 - v_1)e^{-\alpha t} > 0$; but if $v_0 < v_1$, then
$(v_0 - v_1)e^{-\alpha t} < 0$ and, therefore, $v(t) < v_1$. Secondly,
no matter what $v_0$ is, the quantity $e^{-\alpha t}$ is small for
sufficiently large $t$, and, practically speaking, $v(t) = v_1$.

Using (9.14.10), we find that

$$\frac{dx}{dt} = v_1 + (v_0 - v_1)e^{-\alpha t} \qquad (9.14.11)$$

(the reader will recall that $v = dx/dt$), whence, knowing that
$x(0) = 0$, we have

$$x(t) = v_1 t + \frac{v_0 - v_1}{\alpha}\left(1 - e^{-\alpha t}\right). \qquad (9.14.12)$$
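The behavior described here, an approach to the steady velocity $v_1 = g/\alpha$ from below or from above, is easy to tabulate from the solution $v(t) = v_1 + (v_0 - v_1)e^{-\alpha t}$ and the corresponding distance $x(t) = v_1 t + \frac{v_0 - v_1}{\alpha}(1 - e^{-\alpha t})$. The sketch below is an illustration; the function names and the value $\alpha = 2$ s⁻¹ are assumptions made here, with $g = 981$ cm/s².

```python
import math

def fall_velocity(t, v0, g, alpha):
    """v(t) = v1 + (v0 - v1) * exp(-alpha * t), with v1 = g / alpha."""
    v1 = g / alpha
    return v1 + (v0 - v1) * math.exp(-alpha * t)

def fall_distance(t, v0, g, alpha):
    """x(t) = v1 * t + (v0 - v1) / alpha * (1 - exp(-alpha * t)), x(0) = 0."""
    v1 = g / alpha
    return v1 * t + (v0 - v1) / alpha * (1.0 - math.exp(-alpha * t))

g, alpha = 981.0, 2.0      # cm/s^2; alpha = 2 1/s is an assumed value
v1 = g / alpha             # steady velocity, 490.5 cm/s

times = (0.0, 1.0, 2.0, 5.0)
slow = [fall_velocity(t, 0.0, g, alpha) for t in times]       # v0 < v1
fast = [fall_velocity(t, 2.0 * v1, g, alpha) for t in times]  # v0 > v1
```

The list `slow` rises toward `v1` and the list `fast` falls toward it; after five seconds both differ from `v1` by a small fraction of a centimeter per second.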

If the body has a large velocity and considerable dimensions, the
force of resistance is proportional to the square of the velocity. It
has been experimentally established that in this case

$$F = -kS\,\frac{\rho v^2}{2}, \qquad (9.14.13)$$

where $S$ is the cross-sectional area of the body, and $\rho$ is the
density of the medium.$^{9.27}$ We see that the force of resistance in
this case is practically independent of the viscosity of the medium.
The coefficient $k$ in this formula is a dimensionless number, and its
magnitude depends on the shape of the body (for streamlined bodies,
$k$ can drop to 0.03-0.05, while for bodies with poor streamline
characteristics, $k$ can reach 1.0-1.5). Denoting $kS\rho/2$ by
$\kappa$, we obtain

$$F = -\kappa v^2(t). \qquad (9.14.14)$$

It is clear that $\kappa$ has the dimensions of g/cm.

Let us solve the problem of deceleration for the force of resistance
(9.14.14). The appropriate equation is of the form

$$m\,\frac{dv}{dt} = -\kappa v^2.$$

Dividing both sides by $m$ and setting $\kappa/m = \beta$, with
$\beta > 0$, we get $dv/dt = -\beta v^2$, whence
$dv/v^2 = -\beta\,dt$. Integrating, we get

$$-\frac{1}{v}\,\Big|_{v_0}^{v} = -\beta t\,\Big|_{t_0}^{t},$$

where $v_0$ is the velocity of the body at time $t = t_0$. Therefore,
$-v^{-1} + v_0^{-1} = -\beta(t - t_0)$, whence

$$v = \frac{v_0}{1 + \beta v_0 (t - t_0)}. \qquad (9.14.15)$$

From the formula $\beta = \kappa/m$ we find that $\beta$ has the
dimensions of cm$^{-1}$. (Note that for $t - t_0 \gg 1/\beta v_0$, the
velocity is practically independent of the initial velocity:
$v \approx 1/[\beta(t - t_0)]$.)

9.27 This formula is valid for Reynolds
numbers $Rv\rho/\eta \gtrsim 100$. The meaning of the
formula given in the text is that for the motion
of a large body, the energy expended on
overcoming the resistance of the medium is
not spent on the friction of layers of the
liquid but on increasing the kinetic energy of
the liquid, which is forced to move in order to
let the body pass through it. The reader is
advised to derive the formula for the force
(cf. Section 9.15).

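Let us find the formula for the distance. Using (9.14.15), we get

$$dx = \frac{v_0}{1 + \beta v_0 (t - t_0)}\,dt,$$

whence

$$x(t) - x(t_0) = \int_{t_0}^{t} \frac{v_0\,dt}{1 + \beta v_0 (t - t_0)}. \qquad (9.14.16)$$

Assuming that the body begins to move from the origin, $x(t_0) = 0$,
we get, from (9.14.16),

$$x(t) = \frac{1}{\beta}\ln\left[1 + \beta v_0 (t - t_0)\right]. \qquad (9.14.17)$$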


It is easy to see that this formula corresponds to the velocity being
an exponential function of the distance traveled:
$v = v_0 e^{-\beta x}$. If we now wish to find the entire distance
traveled by the body after the force that imparted the velocity ceases
to act, we will see that this distance (formula (9.14.17)) increases
without limit with time. (Note that by formula (9.14.17)
$x \to \infty$ as $t \to \infty$.)
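Both statements, the relation $v = v_0 e^{-\beta x}$ and the unlimited growth of the distance, can be checked directly from $v = v_0/[1 + \beta v_0(t - t_0)]$ and $x = \frac{1}{\beta}\ln[1 + \beta v_0(t - t_0)]$. The sketch below is illustrative; the function names and the values $v_0 = 1000$ cm/s, $\beta = 0.001$ cm⁻¹ are assumptions made here.

```python
import math

def velocity(t, v0, beta, t0=0.0):
    """v(t) = v0 / (1 + beta * v0 * (t - t0)), formula (9.14.15)."""
    return v0 / (1.0 + beta * v0 * (t - t0))

def distance(t, v0, beta, t0=0.0):
    """x(t) = ln(1 + beta * v0 * (t - t0)) / beta, formula (9.14.17)."""
    return math.log(1.0 + beta * v0 * (t - t0)) / beta

v0, beta = 1000.0, 0.001     # assumed values: cm/s and 1/cm

rows = []
for t in (1.0, 10.0, 100.0, 1000.0):
    x = distance(t, v0, beta)
    # The last two entries of each row should coincide: v(t) and v0*exp(-beta*x)
    rows.append((t, x, velocity(t, v0, beta), v0 * math.exp(-beta * x)))
```

According to the formula the distance keeps growing, if only logarithmically: each tenfold increase in the elapsed time adds roughly the same stretch of path.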

Actually, this is not so. The point is that when the velocity of the
body is small, the relation (9.14.14) no longer holds true. We have to
resort to (9.14.1) and, for the distance, to formula (9.14.6).

Let us consider the problem of a body falling in air for the case of
air drag being proportional to the square of the velocity. At start,
let the body fall from the origin with an initial velocity $v_0$. In
analogous fashion to the case where the resistance is proportional to
the velocity, we get the equation

$$\frac{dv}{dt} = g - \beta v^2, \quad\text{or}\quad
\frac{dv}{dt} = \beta\left(\frac{g}{\beta} - v^2\right). \qquad (9.14.18)$$

It is easy to establish that $(g/\beta)^{1/2}$ has the dimensions of
velocity. Set $(g/\beta)^{1/2} = v_1$. Then $g/\beta = v_1^2$, and Eq.
(9.14.18) assumes the form

$$\frac{dv}{dt} = \beta(v_1^2 - v^2). \qquad (9.14.19)$$

An exact solution to Eq. (9.14.19) is given in the answers to Exercise
9.14.1. Let us consider the general properties of the solution.
Reasoning in a manner similar to the way we did for Eq. (9.14.9), we
can show that in this case a velocity of $v_1 = (g/\beta)^{1/2}$ should
set in. We will show that after a long enough time lapse following the
start of fall, the formula

$$v - v_1 = C \exp(-2\beta v_1 t), \qquad (9.14.20)$$

with $C$ a constant, will hold true. Equation (9.14.19) can be
rewritten thus:

$$\frac{dv}{dt} = \beta(v_1 + v)(v_1 - v). \qquad (9.14.21)$$

For $t$ large, $v \approx v_1$, and so in this equation we replace
$v_1 + v$ by $2v_1$. But if we replace $v$ by $v_1$ in $v_1 - v$, we
get $dv/dt = 0$, which yields $v$ = constant = $v_1$. Since we are
interested precisely in the small difference between $v$ and $v_1$
(the law by which $v$ approaches $v_1$), it is not permissible to
ignore the difference $v - v_1$. Thus, we replace (9.14.21) with

$$\frac{dv}{dt} = 2\beta v_1 (v_1 - v). \qquad (9.14.22)$$

Put $v_1 - v = z$. Then $dz/dt = -dv/dt$, and Eq. (9.14.22) assumes
the form

$$\frac{dz}{dt} = -2\beta v_1 z. \qquad (9.14.23)$$

The solution to this equation is

$$z = C \exp(-2\beta v_1 t), \qquad (9.14.24)$$

which coincides with (9.14.20) up to the sign of the arbitrary
constant $C$.

The value of $C$ in (9.14.24) cannot be determined from the initial
condition $v(0) = v_0$ (or $z(0) = v_1 - v_0$) because Eq. (9.14.22)
holds true only for sufficiently large $t$'s (near $t = 0$ we cannot
replace $v + v_1$ by $2v_1$).
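The exponential law of approach to $v_1$, with decay rate $2\beta v_1$, can be tested without using the exact solution: one integrates $dv/dt = g - \beta v^2$ numerically and measures how fast $v_1 - v$ shrinks at late times. The sketch below is an illustration under stated assumptions: the classical fourth-order Runge-Kutta scheme is a technique brought in here (it is not discussed in the text), and the function name and the value $\beta = 0.001$ cm⁻¹ are hypothetical.

```python
import math

def fall_speed(t_end, v0, g, beta, steps_per_second=1000):
    """Integrate dv/dt = g - beta*v**2 from t = 0 to t_end (RK4 scheme)."""
    n = max(1, int(round(t_end * steps_per_second)))
    h = t_end / n
    f = lambda v: g - beta * v * v
    v = v0
    for _ in range(n):
        k1 = f(v)
        k2 = f(v + 0.5 * h * k1)
        k3 = f(v + 0.5 * h * k2)
        k4 = f(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v

g, beta = 981.0, 0.001          # cm/s^2; beta = 0.001 1/cm is an assumed value
v1 = math.sqrt(g / beta)        # steady velocity (g / beta) ** 0.5

# Difference z = v1 - v at two late moments, starting from rest
z3 = v1 - fall_speed(3.0, 0.0, g, beta)
z4 = v1 - fall_speed(4.0, 0.0, g, beta)

measured_rate = math.log(z3 / z4) / (4.0 - 3.0)   # decay rate of z, in 1/s
predicted_rate = 2.0 * beta * v1                  # rate in (9.14.20)
```

The measured rate agrees with $2\beta v_1$ to within a fraction of a percent, and the agreement improves the later the two moments are chosen, in line with the remark that the linearized equation holds only for large $t$.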

Also observe that the formula $F(t) = -\kappa v^2(t)$ is valid only
when $v > 0$. Indeed, if $v < 0$, then it must be true that
$F(t) = \kappa v^2(t)$, since the force of resistance is in opposition
to the velocity and, hence, is positive if the velocity is negative.
Both cases ($v > 0$ and $v < 0$) are embraced by the formula

$$F(t) = -\kappa\,v(t)\,|v(t)|.$$

Exercises 

9.14.1. Find the expression for the velocity
as a function of time from the equation
$dv/dt = \beta(v_1^2 - v^2)$ with the initial condition
$v(0) = v_0$. Using the formula for $v$, show that a velocity
of $v_1 = (g/\beta)^{1/2}$ sets in. Show that $C$ in
(9.14.24) (or (9.14.20)) is equal to
$2v_1(v_0 - v_1)/(v_1 + v_0)$.

9.14.2. In the problem of falling bodies
(the resistance is proportional to the velocity)
take into account the fact that the body is acted
upon by a buoyant force in accord with the
Archimedean law.

9.14.3. Applying the result of the preceding
exercise to a ball and noting that for a
sphere $k = 6\pi R\eta$, where $R$ is the ball’s radius,
and $\eta$ is the viscosity of the medium, demonstrate
that for large values of $t$ the velocity of
the ball, $v$, is equal to $2R^2 g(\rho - \rho')/9\eta$, where
$\rho$ is the density of the material of the ball, and
$\rho'$ is the density of the medium.

9.15* The Motion of a Body in a Fluid 

Let us now discuss in somewhat greater
detail the physical nature and the regularities
that characterize the forces
that act on a body moving in a continuous
medium, say, in air or in water.
In view of the great importance of this 
question to technology, especially to 
aviation, it has been studied in great 
detail and constitutes the subject of a 
special branch of science, fluid dynamics . 
The particular cases discussed in Sec- 
tion 9.14 correspond to extreme simpli- 
fication of the true laws of motion of 
fluids. 

An assumption that is more exact than the one formulated in Section
9.14 is that the force of resistance of the medium on a given body can
be written in the following form:

$$F = -kS\,\frac{\rho v^2}{2}, \qquad (9.15.1)$$

where $S$ is the cross-sectional area of the body (cm²), $\rho$ is the
density of the medium (g/cm³), and $k$ a dimensionless quantity (cf.
(9.14.14)). If we do not restrict our discussion to slow motion, the
coefficient $k$ is a function of two dimensionless quantities, namely,
the Reynolds number $N_{\mathrm{Re}}$ (see footnote 9.25) and the
Mach$^{9.28}$ number $N_{\mathrm{Ma}}$, which is the ratio of the
velocity $v$ of a body with respect to the surrounding fluid to the
speed of sound $c$ in the medium, or $N_{\mathrm{Ma}} = v/c$. Hence

$$k = k(N_{\mathrm{Re}}, N_{\mathrm{Ma}}). \qquad (9.15.2)$$

In the particular case of a body moving with a velocity that is low
compared to the speed of sound, or $N_{\mathrm{Ma}} \ll 1$, we have

$$k(N_{\mathrm{Re}}, N_{\mathrm{Ma}} = 0) = k(N_{\mathrm{Re}}). \qquad (9.15.3)$$


The behavior of this function depends 
on the shape of the body, on the orien- 
tation of the body with respect to the 
direction of the velocity, and, finally, 
on what quantity is chosen as the char- 
acteristic size in the definition of the 
Reynolds number. Only in the simplest 
case of a ball do all these three factors 
play no role. 

For low Reynolds numbers, $N_{\mathrm{Re}} \ll 1$, the asymptotic
behavior of (9.15.3) is

$$k = \frac{\text{constant}}{N_{\mathrm{Re}}}, \qquad (9.15.4)$$

where we have left out the terms that are small compared to the
leading term. Substituting into (9.15.1) the value of $k$ obtained
through disregarding such terms in (9.15.4) and allowing for the fact
that $N_{\mathrm{Re}} = vR\rho/\eta$ (by definition), we get

$$F = \frac{\text{constant} \times \eta}{vR\rho}\,\frac{\rho v^2}{2}\,S
= \text{constant} \times \eta R v, \qquad (9.15.5)$$

where the constants in the middle and right members of (9.15.5) are,
of course, different.

We have again arrived at the Stokes law (see Eq. (9.14.2)), but what
is important here is that in this approximation the force of
resistance is independent of the density of the fluid. The last
transformation in (9.15.5) is based on the fact that the
cross-sectional area $S$ of the body is proportional to the square of
the linear dimensions of the body, $R^2$ (here we consider
geometrically similar bodies), and it is for this reason that only the
first power of $R$ remains in the expression for the force. One must
bear in mind, however, that for any finite value of $N_{\mathrm{Re}}$
formula (9.15.4) is only approximate: it contains correction terms
that are infinitesimal compared to the leading term only for very low
values of $N_{\mathrm{Re}}$.

9.28 Ernst Mach (1838-1916), an Austrian
physicist and philosopher.

In the other limiting case, $N_{\mathrm{Re}} \to \infty$, the
situation proves to be more complicated. Here $k$ can be assumed
constant, as was done earlier; although the assumption that $k$ is
constant is rather crude, in a book for beginners it is quite
appropriate, we believe. In many textbooks on mechanics and exterior
ballistics, the equations of motion are integrable explicitly on the
assumption that $k$ is constant, and $F = \text{constant} \times v^2$.
But the aerodynamicist will say that $k$ depends, even if weakly, on
$N_{\mathrm{Re}}$ for large values of the Reynolds number. For
instance, for a ball at $N_{\mathrm{Re}} = 100$, $k \approx 1.2$,
while at $N_{\mathrm{Re}} = 2 \times 10^5$, $k \approx 0.4$. Then, in
a relatively narrow region of Reynolds numbers from $2 \times 10^5$ to
$5 \times 10^5$, the force of resistance diminishes by a factor of
approximately three, after which, up to $N_{\mathrm{Re}} = 10^8$, it
may be assumed that $k = 0.12$. On the whole, we see that when
$N_{\mathrm{Re}}$ changes by a factor of a million
($10^8 \div 10^2 = 10^6$), $k$ changes by a factor of 10 (from 1.2 to
0.12), that is, over a broad range of values of $N_{\mathrm{Re}}$ the
following interpolation holds true:
$k \propto N_{\mathrm{Re}}^{-1/6}$. In elementary (or “crude”)
calculations we can always substitute a constant for the extremely
slowly varying function $N_{\mathrm{Re}}^{-1/6}$ (provided that the
velocity varies within an interval that is not very broad).
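The arithmetic behind the interpolation $k \propto N_{\mathrm{Re}}^{-1/6}$ is that a millionfold change in the Reynolds number changes $k$ by $(10^6)^{1/6} = 10$. The sketch below is only an illustration: the function name `k_interpolated` and the choice to pin the power law to the quoted value $k = 1.2$ at $N_{\mathrm{Re}} = 100$ are assumptions made here.

```python
def k_interpolated(n_re, k_ref=1.2, n_ref=100.0):
    """Power-law interpolation k ~ N_Re**(-1/6), pinned to k_ref at n_ref."""
    return k_ref * (n_re / n_ref) ** (-1.0 / 6.0)

# Values of the interpolation at the Reynolds numbers quoted in the text
table = {n_re: k_interpolated(n_re) for n_re in (1e2, 2e5, 1e8)}
```

The interpolation reproduces the end values 1.2 and 0.12 by construction, while at $N_{\mathrm{Re}} = 2 \times 10^5$ it gives roughly 0.34 against the quoted 0.4, which shows how crude such a fit is in between.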

Let us now discuss the physical pic- 


ture of the motion of a liquid around a 
body and the origin of the forces acting 
on the body. Here it is more convenient 
to transform to a system of coordinates 
in which the body is fixed, that is, 
picture a fixed body in a flowing liquid. 

Prior to contact with the body, that 
is, far from the body, in the direction 
opposite to that of the velocity of the 
liquid, the entire liquid moves with 
the same velocity (both in direction 
and magnitude). Near the body the flow 
changes its behavior. Clearly, the component
of the velocity perpendicular
to the surface of the body is zero: no liquid
flows into, or out of, the solid, that
is, no liquid crosses the solid boundary
of the body. But if the liquid has viscosity,
it can be stated that the other
component, the one tangent to the surface
of the body, is zero too; in this
case the velocity of the liquid at the
surface of the body is zero.

Two forces act at the surface of the body: the force of pressure
normal to the surface and the viscous force tangent to the surface and
directed in the same direction as the velocity of the liquid at a
small distance from the surface. The viscous force is equal to
$\eta\,(dv_t/dn)_s$, where $v_t$ is the tangential component of the
velocity, and $d/dn$ stands for the derivative along the outward
normal $n$ to the surface of the body; at the surface $v_t = 0$, but
the derivative $dv_t/dn$ differs from zero.

If it is known how the liquid moves at
each point near the surface of a body,
the calculation of the pressure and
the viscous force poses no difficulty.
However, the calculation of the motion
is extremely difficult, and in general
form is inaccessible at present; the
calculation has been carried out only
for a few simple cases.

In slow motion, the viscous forces play the principal role. The
velocity distribution in various fluids with various viscosities will
be similar when the fluids flow around geometrically similar bodies of
different dimensions; for this reason the derivative $dv_t/dn$ at a
given point on the surface of the body is proportional to the velocity
$v$ divided by the size $R$ of the body, which means that the viscous
force per unit surface area is of the order of $F_1 = \eta v/R$. The
total force is $F = SF_1$, where the surface area $S$ for
geometrically similar bodies is proportional to the square of the
characteristic linear dimension: $S = \text{constant} \times R^2$.
Thus,

$$F = \text{constant} \times \eta v R. \qquad (9.15.6)$$

It can be demonstrated that, in the case of slow motion, allowing for
the pressure distribution over the surface changes only the constant.
The Stokes law (9.14.2) for a ball of radius $R$, where the constant
is equal to $6\pi$ (here the characteristic size of the body is the
ball’s radius), is a particular case of the general formula (9.15.6).
The product $6\pi$ in (9.14.2) was obtained through calculations and
agrees well with experimental data. However, for an arbitrary
nonsymmetric body the calculation becomes very complicated even in the
limiting case $N_{\mathrm{Re}} \to 0$.

Even more complicated is the case 
where a large body is in the stream of a 
liquid with large Reynolds numbers. 
The natural way to approach this prob- 
lem is to allow for the inertia of the 
liquid’s motion; a change in direction 
and magnitude of the velocity depends 
then on the nonuniformity in the pres- 
sure distribution. 

When the stream (or jet) is decelerated,
there appears a pressure of the order
of $\rho v^2$. Since the viscous force is
proportional to the first power of the
velocity, it can be ignored in rapid motion
with large Reynolds numbers. It
would seem that the result is simply
$F_1 = \rho v^2$, or $F = \text{constant} \times \rho v^2 S$,
that is, with a coefficient $k$ that is constant.

However, reality proved to be more 
complicated than this. In the 18th
century the French mathematician,
philosopher, and physicist Jean Le
Rond d’Alembert found that for a liquid
whose viscosity is zero there is a
solution to the equations of motion in
which the pressure at different points
on the surface is different but the resultant
of the forces of pressure is identically
zero. In reality this situation never
occurs since even the smallest viscosity
modifies the flow considerably.
A new phenomenon emerges: although
the oncoming flow is time independent,
near the surface of the body and behind
it the motion of the liquid changes
in a random manner (the picture
of the motion is like that of a flame of
a bonfire). The phenomenon is known as
turbulence, or turbulent flow.

The rise in pressure at the leading 
side of the body (on which the stream 
hits the body) is not compensated for 
by the pressure behind the body. The 
modification of the flow brought on by 
the viscosity of the liquid proves to be 
significant and results in a nonzero 
force of resistance, or drag. As noted 
earlier, the numerical value of this re- 
sistance depends very little on the 
Reynolds number, that is, depends but 
weakly on the viscosity. 

In the system of coordinates where 
the body is in motion and the liquid is 
initially, that is, in the absence of the 
body, at rest, the cylinder of liquid 
through which the body travels during 
time $t$ (the mass of this cylinder is
$M_1 = \rho S v t$) will part to let the body
through and will acquire a velocity of
the order of $v$. Hence, the energy of this
cylinder is $E = M_1 v^2/2 = t \rho S v^3/2$.
This energy has to be taken away from
the body; in other words, the body performs
work. The work done by the body
over time $t$ is force times the distance:
$A = Fvt$. Bearing in mind that
$Fvt = t \rho S v^3/2$, we arrive at the following
formula:

$$F = \frac{\rho S v^2}{2}, \tag{9.15.7}$$

which correctly yields the order of mag- 
nitude of this quantity. The liquid 
needs a certain time to “quiet down” 
after the body has passed through it, 
but its energy transforms into heat in- 
stead of returning to the body. 
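
The two limiting estimates of this section can be compared numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values for water are assumptions chosen for illustration, and the Reynolds number is formed with the ball's radius as the characteristic size.

```python
import math

def stokes_drag(eta, radius, v):
    """Viscous drag on a ball in slow flow, Stokes law F = 6*pi*eta*R*v."""
    return 6.0 * math.pi * eta * radius * v

def inertial_drag_estimate(rho, area, v):
    """Order-of-magnitude drag at large Reynolds numbers, F = rho*S*v**2/2 (9.15.7)."""
    return 0.5 * rho * area * v ** 2

def reynolds_number(rho, v, size, eta):
    """N_Re = rho*v*size/eta, with the ball's radius (or the plate's length) as size."""
    return rho * v * size / eta

# Assumed illustrative values (SI units): a ball of radius 1 cm moving at 1 m/s in water.
rho, eta = 1000.0, 1.0e-3
radius, v = 0.01, 1.0
area = math.pi * radius ** 2          # cross-section presented to the flow

n_re = reynolds_number(rho, v, radius, eta)
f_viscous = stokes_drag(eta, radius, v)
f_inertial = inertial_drag_estimate(rho, area, v)
print(n_re, f_viscous, f_inertial, f_inertial / f_viscous)
```

For a ball with $S = \pi R^2$ the ratio of the two estimates equals $N_{\mathrm{Re}}/12$, so the viscous formula dominates only when the Reynolds number is small.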

The way in which a liquid flows around 
a body depends essentially on the 
body’s shape. Certain shapes have been 
found that have $k$’s that are smaller
than the $k$’s for a ball and disk by a
factor of several tens (it is assumed
that the $k$ of the disk is measured when
the disk is perpendicular to the flow).

The general idea is that the liquid, 
after it has let the body through, smooth- 
ly “closes ranks” behind the body. 
If this is so, the contribution of the 
difference in pressures diminishes and 
becomes 10 to 15% of the entire force 
of resistance. Even at high Reynolds 
numbers, the greatest part of the force 
of resistance (85 to 90%) directly acting 
on the body’s surface is provided by 
the viscous force. But this does not mean
that the total resistance force and
$k$ are proportional to the viscosity for
streamlined bodies for high values of
$N_{\mathrm{Re}}$.

The viscosity, it appears at first
glance, enters the formula for $k$ (and $F$)
in the first power. However, the viscous
force is proportional to $\eta\,(dv_t/dn)$ and,
all other things remaining the same (the
dimension of the body, the velocity of
the liquid at infinity, and the like),
the transient layer within which the
velocity varies gets thinner as $\eta$ decreases,
that is, the derivative $dv_t/dn$
grows. Even for streamlined bodies and
high Reynolds numbers, $k$ is proportional
to an extremely low power of $\eta$;
precisely, it changes like $\eta^{1/6}$. (Some
researchers prefer formulas of the type
$k = (a + b \ln N_{\mathrm{Re}})^{-1}$.)

This is also the situation in the extreme 
case of a liquid flowing around a thin 
plate lying along the flow. It would 
seem that in the limit of small viscosi- 
ties the liquid should slip along the 
plate, since the pressure contributes 
nothing and the friction force is pro- 
portional to the viscosity and is small. 
But in reality turbulence sets in (at
$N_{\mathrm{Re}} > 10^6$) and the drag grows.^{9.29}

^{9.29} In calculating the Reynolds number we
substitute the length $L$ of the plate for the
radius $R$ of the ball: $N_{\mathrm{Re}} = \rho v L/\eta$.

If the shape of the body or its position
in relation to the flow is nonsymmetric,
for instance, the plate is at an angle to
the flow, there appears, in addition to
the force parallel to the velocity of the
liquid, a perpendicular component,
the hydrodynamic lift (or aerodynamic
lift if the fluid is air), which can exceed
the drag by a factor greater than 20.
The flight of airplanes and gliders is
based on this fact.

Chapter 10 Oscillations


10.1 Motion Under the Action 
of an Elastic Force 

Let us examine the case where the force 
acting on a body depends solely on 
the position of the body, F = F (x). 
Above (see Section 9.2) we considered 
in detail the work done by such a force 
and found out that in this case the sys- 
tem has a definite potential energy 
$u(x)$ with which the force is connected
by the relation

$$F(x) = -\frac{du}{dx}.$$

Let us now turn to the problem of
the motion of a body under the action
of such a force. The basic equation of
motion is of the form

$$m\,\frac{dv}{dt} = F(x). \tag{10.1.1}$$

This equation cannot be solved di- 
rectly since it involves the derivative 
with respect to time, while the force is 
given as a function of the x coordinate. 
It is therefore natural to seek the quan- 
tities of interest as functions of the x 
coordinate. In particular, we will seek
the velocity as a function of the coordinate,
that is, $v = v(x)$. We will then
represent the derivative with respect
to time, $dv/dt$, as the derivative of a
composite function, since the $x$ coordinate
itself depends on the time:

$$\frac{dv[x(t)]}{dt} = \frac{dv[x(t)]}{dx}\,\frac{dx(t)}{dt}.$$

But $dx/dt$ is nothing other than the
velocity $v(x)$. Thus, $dv/dt = v\,(dv/dx)$,
whence

$$m\,\frac{dv}{dt} = m v\,\frac{dv}{dx} = \frac{d}{dx}\left(\frac{mv^2}{2}\right). \tag{10.1.1a}$$

Substituting this into the equation of
motion (10.1.1), we get

$$\frac{d}{dx}\left(\frac{mv^2}{2}\right) = F(x). \tag{10.1.2}$$


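Integrating, we find that

$$\int_{v_0}^{v} d\left(\frac{mv^2}{2}\right) = \int_{x_0}^{x} F(x)\,dx,$$

or

$$\frac{mv^2}{2} - \frac{mv_0^2}{2} = \int_{x_0}^{x} F(x)\,dx.$$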

The physical meaning of this expression 
is quite clear: the change in kinetic ener- 
gy is equal to the work done by the force. 
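
The statement can be checked numerically. The sketch below is an illustration under stated assumptions, not the book's method: it integrates $m\,dv/dt = F(x)$ with the velocity Verlet scheme (a technique swapped in here for convenience), accumulates the work by the trapezoidal rule, and compares it with the change in kinetic energy. The force $F = -kx$ and all numerical values are assumed for the example.

```python
def simulate(force, m, x0, v0, dt, steps):
    """Integrate m*dv/dt = F(x) with the velocity Verlet scheme.

    Returns the final x and v together with the work, the sum of F*dx
    accumulated along the path by the trapezoidal rule.
    """
    x, v = x0, v0
    work = 0.0
    f = force(x)
    for _ in range(steps):
        x_new = x + v * dt + 0.5 * (f / m) * dt * dt
        f_new = force(x_new)
        v = v + 0.5 * (f + f_new) / m * dt
        work += 0.5 * (f + f_new) * (x_new - x)
        x, f = x_new, f_new
    return x, v, work

m, k = 2.0, 3.0            # assumed illustrative values
x0, v0 = 0.5, 1.0
x, v, work = simulate(lambda pos: -k * pos, m, x0, v0, dt=1e-4, steps=5000)

delta_kinetic = 0.5 * m * v * v - 0.5 * m * v0 * v0
print(delta_kinetic, work)   # the two numbers should nearly coincide
```

The small residual difference between the two printed numbers comes from the finite time step and shrinks as `dt` is reduced.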

Using potential energy u = u (x), 
we can write Eq. (10.1.2) as 

$$\frac{d}{dx}\left(\frac{mv^2}{2}\right) = -\frac{du}{dx},$$

or

$$\frac{d}{dx}\left(\frac{mv^2}{2} + u(x)\right) = 0.$$

If the derivative of an expression is
identically zero, the expression itself
is a constant; therefore

$$\frac{mv^2}{2} + u(x) = \text{constant}. \tag{10.1.3}$$

In this form, formula (10.1.3) expresses
the law of conservation of energy,
namely, when a body is in motion due to
the action of a force depending exclusively
on the position of the body (its
coordinate), the sum of the kinetic
energy of the body, $mv^2/2$, and its potential
energy $u(x)$ remains constant.
This sum is called the total energy
of the body.

The purpose of these lengthy trans- 
formations here and in Section 9.14 
was to show that the law of conserva- 
tion of energy (as applied to mechanics) 
is a consequence of Newton’s second 
law. Also a corollary to Newton’s second
law is the fact that the kinetic
energy of a body is precisely $mv^2/2$
and not some other function of the velocity
of the body. Of course, the value
of the law of energy conservation is not 
limited by the simple examples consid- 
ered in this book. This law remains 
valid within a much broader class of 
phenomena, even such phenomena whose 
essence has yet to be clarified. 

How do we continue the solution of
the problem of the motion of the body?
Using the value of the velocity $v_0$ and
the $x$ coordinate $x_0$ of the body at the
initial time ($t = t_0$), we find the total
energy $E$ of the body, which is a quantity
that remains constant throughout
the motion: $E = mv_0^2/2 + u(x_0)$. With
the aid of formula (10.1.3) and knowing
$E$, we can find the velocity of the
body as a function of $x$:

$$v(x) = \sqrt{\frac{2}{m}\,[E - u(x)]}. \tag{10.1.4}$$

It remains to find the relationship
between $x$ and $t$. From (10.1.4) we get

$$\frac{dx}{dt} = \sqrt{\frac{2}{m}\,[E - u(x)]},$$

whence

$$dt = \frac{dx}{\sqrt{\frac{2}{m}\,[E - u(x)]}},$$

$$t = t_0 + \int_{x_0}^{x} \frac{dx}{\sqrt{\frac{2}{m}\,[E - u(x)]}}. \tag{10.1.5}$$

Thus, the time $t$ is expressed as a function
of the $x$ coordinate: $t = t(x)$.
This function is given by an integral.
We can find $x = x(t)$ by solving (10.1.5)
for $x$. Since $v$ is expressed as a function
of $x$ by means of a square root (Eq.
(10.1.4)), even a simple expression for
$u = u(x)$ frequently leads to extremely
awkward expressions for $t = t(x)$.
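
When the integral in (10.1.5) has no convenient closed form, it can be evaluated numerically. The following is a minimal sketch under stated assumptions: the function name, the midpoint rule, and the test potential $u = kx^2/2$ with $m = k = 1$ are illustrative choices, not taken from the text. For that test case the exact motion is $x = \sin t$, so $t(x) = \arcsin x$ serves as a check.

```python
import math

def time_to_reach(u, m, energy, x0, x, t0=0.0, n=20000):
    """Evaluate t = t0 + integral from x0 to x of dx / sqrt((2/m)*(E - u(x))),
    Eq. (10.1.5), by the midpoint rule on n subintervals.
    Assumes motion in the +x direction and E > u everywhere on [x0, x]."""
    h = (x - x0) / n
    total = 0.0
    for i in range(n):
        xm = x0 + (i + 0.5) * h            # midpoint of the i-th subinterval
        total += h / math.sqrt(2.0 / m * (energy - u(xm)))
    return t0 + total

# Assumed test case: u = k*x**2/2, m = k = 1, start at x0 = 0 with v0 = 1, so E = 1/2.
m, k = 1.0, 1.0
energy = 0.5 * m * 1.0 ** 2
t_half = time_to_reach(lambda pos: 0.5 * k * pos * pos, m, energy, 0.0, 0.5)
print(t_half, math.asin(0.5))
```

The integrand grows without bound at a point where $E = u(x)$, so a plain rule like this one should only be used on intervals that stop short of such points.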

To get a general idea of the nature
of the motion, it will be useful to draw
the curve of $u(x)$. Suppose the graph
of $u = u(x)$ has the form depicted in
Figure 10.1.1. If on the same graph we
plot the horizontal line at height $E$
(the straight line $u = E$), the picture is
vivid in the extreme. Intuitively, it is
clear that here the body will move between
two extreme points, or that the
motion is oscillatory. The velocity is
proportional to the square root of the
difference $E - u(x)$. For instance,
when $x = x_A$, the velocity is proportional
to the square root of the length of
the line segment $AB$ (see Figure 10.1.1).

Figure 10.1.1

To the right of point $C$ and to the
left of point $D$ is a region where $E < u(x)$,
that is, a region that a body
with a given total energy $E$ cannot
reach for lack of energy (this is reflected
in the fact that in (10.1.5) there appears
a root of a negative quantity). In the
region where $E > u(x)$, the square
root yields two possible values for
$v(x)$ in accord with two signs of the
radical:

$$v = \pm\sqrt{\frac{2}{m}\,[E - u(x)]}.$$

At the initial time, both the magnitude
and the sign of $v_0$ are determined by
the initial conditions. From then on
the motion occurs in the direction specified
by the sign of the initial velocity
$v_0$. Evidently, at $v \neq 0$ the sign of the
velocity cannot change suddenly. For
instance, if at the initial time the body
was located at the point $x = x_0$ (lying
somewhere between points $x_D$ and
$x_C$) and $v_0 > 0$, the body will reach the
extreme admissible point $x_C$.

At point $x_C$, where the body’s velocity
vanishes, the formula
$v = \sqrt{(2/m)\,[E - u(x)]}$ is replaced by
$v = -\sqrt{(2/m)\,[E - u(x)]}$. Since
$v = 0$ at this point, the change in sign
takes place without a jump (discontinuity)
in the velocity. The situation is
similar at the point $x = x_D$. Thus, in
the case depicted in Figure 10.1.1, the
motion of the body will be in the form
of oscillations between the two extreme
positions, $x_C$ and $x_D$.
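
The extreme positions are the points where $u(x) = E$, and for a given potential they can be located numerically. The sketch below is only an illustration: the bisection helper, its name, and the parabolic well with $E = 2$ are assumptions made for the example.

```python
def turning_point(u, energy, x_inside, x_outside, tol=1e-12):
    """Bisection for the point where u(x) = E.

    x_inside must lie in the allowed region (u < E),
    x_outside in the forbidden region (u > E).
    """
    if not (u(x_inside) < energy < u(x_outside)):
        raise ValueError("the bracket must straddle the level u = E")
    a, b = x_inside, x_outside
    while abs(b - a) > tol:
        mid = 0.5 * (a + b)
        if u(mid) < energy:
            a = mid        # mid is still in the allowed region
        else:
            b = mid        # mid is in the forbidden region
    return 0.5 * (a + b)

# Assumed example: parabolic well u = x**2/2 and total energy E = 2.
well = lambda pos: 0.5 * pos * pos
x_c = turning_point(well, 2.0, 0.0, 10.0)     # right-hand extreme position
x_d = turning_point(well, 2.0, 0.0, -10.0)    # left-hand extreme position
print(x_d, x_c)
```

For this well the body oscillates between the two returned points; any starting point with $u < E$ can serve as `x_inside`.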

Let us consider another example. 
Suppose the potential energy of a body 
is given by the function u = ax, with 
a positive. Find the law of motion of 
the body. 


We denote the values of the coordinate
$x$ and the velocity $v$ at the initial
instant of time, $t_0$, by $x = x_0$ and $v = v_0$.
Then the total energy $E$ is equal
to $mv_0^2/2 + ax_0$. Using (10.1.4) yields

$$v(x) = \sqrt{\frac{2}{m}\left(\frac{mv_0^2}{2} + ax_0 - ax\right)} = \sqrt{v_0^2 + \gamma\,(x_0 - x)},$$

with $\gamma = 2a/m$. Knowing $v(x)$, we
find the time

$$t = t_0 + \int_{x_0}^{x} \frac{dx}{\sqrt{v_0^2 + \gamma\,(x_0 - x)}}.$$

Since under the integral sign we have
$-\gamma^{-1} z^{-1/2}\,dz$, where $z = v_0^2 + \gamma\,(x_0 - x)$,
we can write

$$t = t_0 - \frac{2}{\gamma}\,\sqrt{z}\,\Big|_{v_0^2}^{v_0^2 + \gamma (x_0 - x)} = t_0 - \frac{2}{\gamma}\left[\sqrt{v_0^2 + \gamma\,(x_0 - x)} - v_0\right].$$

From this we find $x$:

$$\frac{\gamma}{2}\,(t - t_0) = -\sqrt{v_0^2 + \gamma\,(x_0 - x)} + v_0.$$

Transposing $v_0$ to the left, squaring,
and canceling $\gamma$, we obtain

$$x = -\frac{\gamma}{4}\,(t - t_0)^2 + v_0\,(t - t_0) + x_0. \tag{10.1.6}$$

Finding $d^2x/dt^2$ we are convinced that
what we have is a case of uniformly decelerated
motion. This was to be expected
since $F = -du/dx = -a$, that
is, the force is constant and negative,
which means that the motion is uniformly
decelerated. In this elementary
case where the force was actually independent
of $x$, there was of course no reason
for employing such a complicated
computational procedure.
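
Formula (10.1.6) can be checked against the conservation law (10.1.3). The sketch below is illustrative only: the function names and the numerical values of $a$, $m$, $x_0$, $v_0$ are assumptions, and the velocity is obtained by differentiating (10.1.6) with respect to $t$.

```python
def x_of_t(t, x0, v0, a, m, t0=0.0):
    """Eq. (10.1.6): x = -(gamma/4)*(t - t0)**2 + v0*(t - t0) + x0, gamma = 2a/m."""
    gamma = 2.0 * a / m
    return -0.25 * gamma * (t - t0) ** 2 + v0 * (t - t0) + x0

def v_of_t(t, v0, a, m, t0=0.0):
    """Velocity dx/dt obtained by differentiating (10.1.6)."""
    return v0 - (a / m) * (t - t0)

# Assumed illustrative values.
a, m, x0, v0 = 3.0, 2.0, 1.0, 4.0
e0 = 0.5 * m * v0 ** 2 + a * x0            # total energy at the initial time
for t in (0.0, 0.5, 1.0, 2.0):
    x, v = x_of_t(t, x0, v0, a, m), v_of_t(t, v0, a, m)
    # last column: deviation of m*v**2/2 + a*x from its initial value
    print(t, x, v, 0.5 * m * v * v + a * x - e0)
```

The last printed column stays at zero up to rounding, which confirms that the uniformly decelerated motion conserves $mv^2/2 + ax$.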

In the next example (in this connection see
also Chapter 16) we will consider the potential
energy whose graph is in the form of a step
(Figure 10.1.2a). Such a function $u(x)$ is associated
with the graph of force given in Figure 10.1.2b
(to convince oneself that this is
so, the reader should recall that $F = -du/dx$):
the force is extremely great and negative, that
is, in the direction of decreasing $x$. The steeper
the curve representing $u(x)$ is (the curve
in Figure 10.1.2a), that is, the shorter the interval
$\Delta = x_1 - x_0$ over which the rise in
$u(x)$ occurs, the greater is the force in absolute
value. The force is zero where the function
$u = u(x)$ is constant (to the left of $x_0$ and to
the right of $x_1$).

Figure 10.1.2

Let a body start moving from $x_0$ (see Figure 10.1.2a)
with a velocity $v_0$. Suppose the
total energy of the body is equal to $E$. For what
values of $E$ can the body reach point $x_1$?
Since $u(x_0) = 0$, it follows that $E = mv_0^2/2$.
On the other hand, $E = mv_1^2/2 + u_1$, where $v_1$
is the body’s velocity at point $x_1$, and $u_1$ is
the potential energy for $x = x_1$. Therefore

$$\frac{mv_1^2}{2} = E - u_1. \tag{10.1.7}$$

From this formula it is evident that if $E$
is lower than $u_1$, the body cannot reach $x_1$
because then $v_1^2$ becomes negative, which is
impossible. For this reason, the body can reach
$x = x_1$ only if $E \geq u_1$.

For this case we determine the work done
by the force $F$ in displacing the body from $x_0$
to $x_1$:

$$A = \frac{mv_1^2}{2} - \frac{mv_0^2}{2} = \frac{mv_1^2}{2} - E.$$

Using (10.1.7), we find that

$$A = E - u_1 - E = -u_1.$$

The force $F$ does not perform any work in further
motion of the body to the right of point
$x_1$, since $F = 0$ for $x > x_1$.
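
The criterion $E \geq u_1$ and formula (10.1.7) translate directly into a short routine. This is a sketch under assumptions: the function name and the sample numbers are hypothetical, and the step is taken to start from $u(x_0) = 0$ as in the text.

```python
import math

def velocity_beyond_step(m, v0, u1):
    """Speed v1 to the right of the step from m*v1**2/2 = E - u1, Eq. (10.1.7),
    with E = m*v0**2/2. Returns None when E < u1 (the body is turned back)."""
    energy = 0.5 * m * v0 ** 2
    if energy < u1:
        return None
    return math.sqrt(2.0 * (energy - u1) / m)

# Assumed illustrative values: m = 1, step height u1 = 8.
print(velocity_beyond_step(1.0, 5.0, 8.0))   # E = 12.5 exceeds u1, the body passes
print(velocity_beyond_step(1.0, 3.0, 8.0))   # E = 4.5 is below u1, the body is reflected
```

Note that the result does not depend on the width $\Delta$ of the step, only on its height $u_1$, in agreement with $A = -u_1$.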

Exercises 

10.1.1. The potential energy is given by
the formula $u = kx^2/2$, where $k$ is positive.
Construct a graph and show that the motion is
oscillatory.

10.1.2. The potential energy $u(x)$ is zero
for $x < 0$, equal to $2x$ for $0 < x < 1$, and
equal to 2 for $x \geq 1$. At the initial time $t_0$
a body of mass 1 gram leaves the origin and
moves rightward with a velocity $v_0$ (cm/s):
(a) $v_0 = 1$, (b) $v_0 = 1.9$, and (c) $v_0 = 2.1$.
In each case investigate whether the body can
continue moving indefinitely to the right.
If it cannot, find the point at which it will
stop.

10.1.3. $u(x) = -x^3 + 4x^2$. At the initial
instant of time a body of mass 2 grams starts
out from a point $x_0$ at a velocity $v_0$ (cm/s):
(a) $x_0 = 1$, $v_0 = 1$, (b) $x_0 = -2$, $v_0 = 1$,
and (c) $x_0 = -2$, $v_0 = -1$. In each case investigate
the nature of the body’s motion (points
at which the body stops, regions which the
body cannot reach). In the case of stopping
points, give at least a rough idea of their coordinates.

10.1.4. The same requirements relative to
$u(x) = x^2/(1 + x^2)$ as in Exercise 10.1.3,
with $m = 2$ grams: (a) $x_0 = 0$, $v_0 = 2$ and
(b) $x_0 = 1/2$, $v_0 = 1/2$. Express the time $t$
as a function of $x$ in terms of an integral.

10.2 The Case of a Force Proportional 
to Deviation. Harmonic Oscillations 

Consider a body acted upon by a force 
F = — kx. As we know, to this force 
there corresponds a potential energy 
$u = kx^2/2$. The origin is the position
of stable equilibrium. The curve of po- 
tential energy (parabola) has the shape 
shown in Figure 10.1.1. 

The motion of a body under the ac- 
tion of such a force constitutes oscilla- 
tions to the left and to the right of the 
position of equilibrium. We can ima- 
gine a ball rolling from one branch of 
the parabola, building up speed and, 
by inertia, rolling up the other branch, 
rolling down again, and so forth.^{10.1}
By Newton’s second law, the equation
of these oscillations is

$$m\,\frac{d^2x}{dt^2} = -kx. \tag{10.2.1}$$

We will not solve it by the general and
rather involved procedure given in Section 10.1.
Instead we will guess the
type of solution and concentrate our
attention on investigating the properties
of the solution.

^{10.1} Of course, this picture does not constitute
an exact description of the process we
are interested in, since the velocity of the
body rolling from one branch of the parabola
is directed along a tangent to the parabola
and can be decomposed into two components,
the horizontal $v_x$ and the vertical $v_z$. The
law of energy conservation includes, naturally,
both quantities, $v_x$ and $v_z$, in view of which
the force on the body (more precisely, the
horizontal component of this force) will not be
exactly equal to $-kx$ (think of the reasons
for this). But if the parabola is so gently sloping
that $v_z$ is moderate, the “rolling from
the parabola” can be described as oscillatory
motion obeying formula (10.2.1).

Thus, we assume that

$$x = a \cos \omega t. \tag{10.2.2}$$

This form of the solution is chosen because
the cosine is one of the simplest
periodic functions, and the process we are
interested in (“oscillations”) is without
doubt a periodic one. Substituting
(10.2.2) into the basic equation
(10.2.1) yields

$$-ma\omega^2 \cos \omega t = -ka \cos \omega t, \tag{10.2.3}$$

since

$$v = \frac{dx}{dt} = -a\omega \sin \omega t, \qquad \frac{d^2x}{dt^2} = \frac{dv}{dt} = -a\omega^2 \cos \omega t.$$

Formula (10.2.3) is valid for any $t$
if $m\omega^2 = k$. Therefore, the function
(10.2.2) does indeed satisfy Eq. (10.2.1)
if $m\omega^2 = k$, whence

$$\omega = \sqrt{\frac{k}{m}}. \tag{10.2.4}$$

Thus

$$x = a \cos\left(t \sqrt{\frac{k}{m}}\right). \tag{10.2.5}$$

Observe that the square root in the expression
for $\omega$ does not lead to two solutions
since $\cos \omega t = \cos(-\omega t)$.
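
The substitution can also be verified numerically. The sketch below is an illustration, not the book's procedure: it approximates the second derivative of $x = a \cos \omega t$ by a central finite difference and checks that $m\,d^2x/dt^2 + kx$ is close to zero when $\omega = \sqrt{k/m}$. The function name and all numerical values are assumptions.

```python
import math

def equation_residual(a, m, k, t, h=1e-4):
    """Return m*x'' + k*x for x = a*cos(omega*t) with omega = sqrt(k/m),
    where x'' is approximated by a central finite difference of step h."""
    omega = math.sqrt(k / m)
    x = lambda s: a * math.cos(omega * s)
    x_dd = (x(t + h) - 2.0 * x(t) + x(t - h)) / (h * h)
    return m * x_dd + k * x(t)

# Assumed illustrative values: a = 0.3, m = 2, k = 8 (so omega = 2).
for t in (0.0, 0.3, 1.7):
    print(t, equation_residual(0.3, 2.0, 8.0, t))
```

The printed residuals are not exactly zero because of the finite step `h` and rounding, but they are many orders of magnitude smaller than either term of the equation.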

We sought the solution to Eq.
(10.2.1) in the form (10.2.2), but it
is quite clear that the sine is no worse
than the cosine.^{10.2} Equation (10.2.1)
then has another solution:

$$x(t) = b \sin \omega t. \tag{10.2.2a}$$

^{10.2} Both functions, $x_1 = a \sin \omega t$ and
$x_2 = b \cos \omega t$, are characterized by the property
that their second derivative is proportional
to the respective function (the proportionality
factors are negative: $x_1'' = -\omega^2 x_1$
and $x_2'' = -\omega^2 x_2$). The difference lies only
in the initial conditions (at $t = 0$): while
$x_1(0) = 0$, for $x_2$ the initial rate of its change
is zero ($x_2'(0) = 0$) instead of the function
itself.

Indeed, $d^2(\sin \omega t)/dt^2 = -\omega^2 \sin \omega t$.
Substituting this function $x(t)$ and
its second derivative into Eq. (10.2.1)
and canceling $\sin \omega t$, we get $\omega = \sqrt{k/m}$,
which coincides with (10.2.4).^{10.3} Whence

$$x = b \sin\left(t \sqrt{\frac{k}{m}}\right). \tag{10.2.5a}$$

^{10.3} The frequency $\omega = -\sqrt{k/m}$ does not
yield a new solution since $b \sin(-\sqrt{k/m}\,t) = -b \sin(\sqrt{k/m}\,t)$.
Thus, the value $\omega = -\sqrt{k/m}$ yields the same solution (10.2.5a),
but with a different coefficient $b$.

The reader can easily see that the sum

$$x = a \cos \omega t + b \sin \omega t \tag{10.2.6}$$

of the solutions (10.2.2) and (10.2.2a)
to Eq. (10.2.1) is also a solution. (To
verify this, find the second derivative
of (10.2.6) and substitute it into Eq.
(10.2.1); $a$ and $b$ can be chosen absolutely
arbitrarily.) Thus, we have the solution
(10.2.6) to Eq. (10.2.1) containing
two arbitrary constants, $a$ and $b$.

The solution (10.2.6) to Eq. (10.2.1)
can be written in a somewhat different
way. Let us consider a solution to Eq.
(10.2.1) of the form (10.2.2) shifted by
a phase angle $\varphi$:

$$x = C \cos(\omega t + \varphi), \tag{10.2.7}$$

where the numerical factor of the trigonometric
function is now denoted by
$C$. Since here, too, $x' = -\omega C \sin(\omega t + \varphi)$
and, hence, $x'' = -\omega^2 C \cos(\omega t + \varphi) = -\omega^2 x$,
the function (10.2.7) is also a solution to Eq.
(10.2.1) provided condition (10.2.4) is
met.^{10.4} But since $\cos(\alpha + \beta) = \cos \alpha \cos \beta - \sin \alpha \sin \beta$,
we can rewrite (10.2.7) thus:

$$x = C \cos \varphi \cos \omega t - C \sin \varphi \sin \omega t,$$

which means that (10.2.7) and (10.2.6)
represent the same solution only if

$$a = C \cos \varphi, \qquad b = -C \sin \varphi, \tag{10.2.8}$$

where $C$ and $\varphi$ can be expressed in
terms of $a$ and $b$:^{10.5}

$$C = \sqrt{a^2 + b^2}, \qquad \tan \varphi = -b/a, \qquad \varphi = \arctan(-b/a). \tag{10.2.8a}$$

^{10.4} Clearly, a shift in the independent
variable of a trigonometric function does not
affect the differential properties of this function:
if $x = \cos(t + \alpha) = \cos \tau$, where
$\tau = t + \alpha$ with $\alpha$ constant, then
$dx/dt = (dx/d\tau)(d\tau/dt) = (dx/d\tau) \times 1 = dx/d\tau$.
(Going from (10.2.2) to (10.2.7) is
the same as a shift in time: $t \to t + t_0$, with
$t_0 = \varphi/\omega$.)

^{10.5} Of course, the simple formula $\varphi = \arctan(-b/a)$
is not sufficient for determining $\varphi$ completely;
we must take into account the signs of $\cos \varphi$ and
$\sin \varphi$ determined from (10.2.8).

Let us find the period of oscillation,
$T$, that is, the time during which the
body returns to its initial position with
its initial velocity (see formulas (10.2.10) and (10.2.11)).
The function $\cos \alpha$ (or $\sin \alpha$ or
$\cos(\alpha + \varphi)$) returns to its initial value
when the angle makes a complete revolution
(changes by $2\pi$). Thus, in the
expression $a \cos \omega t$ the quantity $\omega t$ should
vary through $2\pi$ in one period $T$. Therefore,
$T$ is found from the condition

$$\omega (t + T) = \omega t + 2\pi,$$

whence

$$\omega T = 2\pi, \qquad T = \frac{2\pi}{\omega} = 2\pi \sqrt{\frac{m}{k}}. \tag{10.2.9}$$

The quantity $\nu = 1/T$ gives the number
of oscillations per unit time and
is called the frequency. It has the
dimensions s$^{-1}$ = 1/s (reciprocal second).
The unit of frequency, one cycle (or
oscillation) per second, goes by the
special name hertz (abbreviated Hz),
in honor of the German physicist Heinrich
Hertz (1857-1894). It is evident
from formula (10.2.9) that $\nu = \omega/2\pi$,
but it is more convenient in all formulas
to deal with $\omega$ and not with $\nu$;
otherwise factors such as $2\pi$
will appear throughout. The quantity
$\omega = 2\pi/T$ is called the circular
frequency,^{10.6} and $\varphi$ in (10.2.7) is called
the initial phase angle.

^{10.6} To grasp the origin of this term, consider
a line segment of length $a$ revolving
counterclockwise. The similarity between
rotation and oscillation is quite apparent
since a revolving hand of a clock returns to
its original position after each revolution just
as an oscillating body returns to its original
position after one period. Here the $x$ coordinate
of the end of the clock hand varies according
to the law $x = a \cos \omega t$ if the hand
revolves with an angular velocity $\omega$. If $T$ is the
period of one revolution, then $\nu = 1/T$ is the
number of revolutions per unit time and
$\omega = 2\pi\nu$ is the angular velocity of rotation
expressed in radians per second. Since the
radian is a dimensionless quantity (see Appendix 5),
$\omega$ has the dimensions of 1/s. In view
of the simple meaning of $\omega$ in the case of
circular motion, this quantity in problems
involving oscillations is termed the circular
frequency.

The constant $a$ in (10.2.2) cannot be
determined from Eq. (10.2.1) because
the equation is satisfied for arbitrary $a$,
which can be cancelled from both members
of (10.2.3). The same is true for
the constant $b$ in (10.2.2a) and for
both constants $a$ and $b$ in (10.2.6);
the constants $C$ and $\varphi$ in (10.2.7) are
also arbitrary (check this statement!).

If the motion of the body is described
by (10.2.2), the velocity is

$$v = \frac{dx}{dt} = -a\omega \sin \omega t. \tag{10.2.10}$$

From the relation $\cos^2 \omega t + \sin^2 \omega t = 1$
it follows that for $\cos \omega t = \pm 1$
it will be true that $\sin \omega t = 0$. Consequently,
the velocity $v$ is equal to
zero at times of maximum deviation of
the body in one direction or the other
($x = a$ or $x = -a$). Imagine that at
$t < 0$ the body is placed at the point
$x = a$ and is held at rest at this point
with the aid of some external force (say
a hook) until time $t = 0$, and then the
hook is disengaged. At this instant the
body is at rest, and oscillations are
initiated under the action of a force
$F = -kx$. In this case, the dependence
of the coordinate $x$ of the body upon
time $t$ is given by the formula $x = a \cos \omega t$.
Since the absolute value of
$\cos \omega t$ does not exceed unity, it follows
that $a$ is the largest value of $x$, or
the maximum deviation of the body from
the position of equilibrium. The constants
$b$ in formula (10.2.2a) and $C$
in (10.2.7) have the same meaning.
The number $a$ (or $b$ or $C$, respectively)
is called the amplitude of the oscillations.
From (10.2.2) and (10.2.10) it
follows that the amplitude is equal to
the original deviation of the body if at
the onset of oscillations the body was at
rest.^{10.7}

Let us note in passing that, generally
speaking, if $A(t) = L \cos \omega t$ (or
$A(t) = L \sin \omega t$), then $L$ is the greatest
value of the quantity $A(t)$ and is
termed the amplitude of $A(t)$. We note
also that the oscillation frequency $\omega$
is independent of the amplitude $a$.

Let $x = x_1(t)$ be a solution of
Eq. (10.2.1), that is, let it be true that
$m\,(d^2x_1/dt^2) = -kx_1$. We consider the
function $x_2(t) = c\,x_1(t)$, where $c$ is a
constant. Substituting into Eq.
(10.2.1) the quantities $x_2$ and $d^2x_2/dt^2$,
we get $mc\,(d^2x_1/dt^2) = -kc\,x_1(t)$, or,
cancelling out $c$,

$$m\,\frac{d^2x_1}{dt^2} = -kx_1.$$

Thus, if $x = x_1(t)$ satisfies Eq. (10.2.1),
so does $x_2(t) = c\,x_1(t)$. Similarly, if
$x_1(t)$ and $x_2(t)$ are two solutions to
Eq. (10.2.1), $x(t) = x_1(t) + x_2(t)$ is
also a solution. It is in this manner
that we constructed the general solution
(10.2.6) from the solutions (10.2.2)
and (10.2.2a). The velocity corresponding
to solution (10.2.6) is

$$v = \frac{dx}{dt} = -a\omega \sin \omega t + b\omega \cos \omega t. \tag{10.2.11}$$

Suppose we have taken $x = a \cos \omega t$
as the solution to Eq. (10.2.1). Putting
$t = 0$, we get $x_0 = a$, where $x_0 = x(0)$
is the initial position of the body, hence
$x = x_0 \cos \omega t$. But then $v = dx/dt = -x_0 \omega \sin \omega t$,
so that at $t = 0$ we
have $v = 0$. Therefore, using the solution
$x = a \cos \omega t$, all we can solve is
the problem with zero initial velocity.

Let us try the solution $x = b \sin \omega t$.
Here $v = dx/dt = b\omega \cos \omega t$, and at
$t = 0$ we get $v_0 = b\omega$, where $v_0 = v(0)$
is the initial velocity, and

$$b = v_0/\omega, \tag{10.2.12}$$

whence $x = (v_0/\omega) \sin \omega t$. But at
$t = 0$ we have only $x = 0$. Again, we
are unable to solve the general problem
with the aid of this solution; we must
assume that the oscillations started at
the point of equilibrium $x = 0$.

^{10.7} We defined the amplitude as one half
of the full span of one oscillation. Between
the extreme left-hand point $x = -a$ and
the extreme right-hand point $x = a$, the body
travels a distance of $2a$, which is twice the
amplitude.

The relation (10.2.12) offers a prac- 
tical and convenient device for mea- 
suring impulse and velocity that is 
widely used in mechanics and goes by 
the name ballistic pendulum. If a body 
is suspended in the form of a pendulum 
or is held in a position of equilibrium 
by springs and its frequency is known, 
the initial velocity following a blow 
may be determined from the amplitude 
of oscillations caused by the blow. 

We will now show how (10.2.12) can be ap- 
proximately obtained via some general elemen- 
tary reasoning. The dimensions of amplitude 
are cm, those of velocity cm/s, and those of 
the oscillation period are s. Therefore, on di- 
mensional grounds, the amplitude must be 
of the same order as the product of the initial 
velocity into some portion of the period. Since 
motion caused by a blow lasts a quarter period
up to maximum deviation and $v < v_0$ (because
the motion is retarded), it follows that
$b < v_0 T/4$. If the motion had a constant deceleration,
the average velocity would be equal
to half the initial value and, consequently,
$b \approx v_0 T/8$. Actually, however, as follows from
formulas (10.2.9) and (10.2.12),

$$b = \frac{v_0 T}{2\pi} = \frac{v_0 T}{6.28},$$

which is quite close to the result obtained via 
the approximate approach. The important thing 
here is that because the period is independent 
of the amplitude, the latter is directly pro- 
portional to the initial velocity. 
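
The comparison between the rough bounds and the exact relation is easy to reproduce. The sketch below is illustrative only: the function names and the sample values of $v_0$ and $T$ are assumptions, and the second function simply inverts the first, as one would when reading a ballistic pendulum.

```python
import math

def amplitude_from_blow(v0, period):
    """Exact amplitude b = v0/omega = v0*T/(2*pi), from (10.2.9) and (10.2.12)."""
    return v0 * period / (2.0 * math.pi)

def initial_velocity_from_amplitude(b, period):
    """Ballistic pendulum reading: v0 = omega*b = 2*pi*b/T."""
    return 2.0 * math.pi * b / period

# Assumed illustrative values: v0 = 0.5 (cm/s), T = 2 (s).
v0, period = 0.5, 2.0
b_exact = amplitude_from_blow(v0, period)
print(v0 * period / 8.0, b_exact, v0 * period / 4.0)   # estimate, exact value, upper bound
print(initial_velocity_from_amplitude(b_exact, period))
```

The exact amplitude lies between the constant-deceleration estimate $v_0 T/8$ and the upper bound $v_0 T/4$, as the dimensional argument suggests.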

We have verified that two different 
functions, (10.2.2) and (10.2.2a), satisfy
the equation $m\,(d^2x/dt^2) = -kx$.
Suppose we want to solve the problem
of the motion of a body having a given
initial position and a given initial velocity,
that is, $x = x_0$ and $v = v_0$
at $t = 0$ (the values of $x_0$ and $v_0$
are arbitrary, and neither need be equal
to zero). We will call this the general
problem. Up to now, in contrast to the
general problem, we have only considered
particular problems, in one of
which $x = x_0$ and $v = 0$ at $t = 0$ and
in the other $v = v_0$ at $t = 0$ but $x = 0$
(at $t = 0$).

With the help of (10.2.6) we can solve 
the general problem of the motion of a 
body with an arbitrary position and an 
arbitrary velocity at the initial time: 
$x = x_0$ and $v = v_0$ at $t = 0$. Using the
initial conditions, we find from (10.2.6)
and (10.2.11) that $a = x_0$ and $b = v_0/\omega$. Whence

$$x = x_0 \cos \omega t + \frac{v_0}{\omega} \sin \omega t. \tag{10.2.13}$$

The solution (10.2.6) is termed the 
general solution to Eq. (10.2.1), while 
(10.2.2) and (10.2.2a) are particular 
solutions to Eq. (10.2.1). 
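
The general solution (10.2.13) and the corresponding velocity from (10.2.11) can be written as one small function. This is a minimal sketch: the function name and the sample values are assumptions made for illustration.

```python
import math

def harmonic_motion(x0, v0, omega, t):
    """Position from (10.2.13) and velocity from (10.2.11) with a = x0, b = v0/omega."""
    x = x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)
    v = -x0 * omega * math.sin(omega * t) + v0 * math.cos(omega * t)
    return x, v

# Assumed illustrative values: x0 = 1, v0 = 2, omega = 3.
x0, v0, omega = 1.0, 2.0, 3.0
period = 2.0 * math.pi / omega
print(harmonic_motion(x0, v0, omega, 0.0))      # reproduces the initial conditions
print(harmonic_motion(x0, v0, omega, period))   # same state one period later
```

Both the initial position and the initial velocity are recovered after one period $T = 2\pi/\omega$, which is the defining property of the period.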

The motion described by (10.2.6) 
or (10.2.7) is known as harmonic oscil- 
lations. 

If the solution is written in the form 
(10.2.7), then it is clear that the amplitude
is equal to $C$. Hence, if the solution
is of the form (10.2.6), $C = \sqrt{a^2 + b^2}$.
Suppose $x = x_0$ and $v = v_0$ at $t = 0$.
Then $a = x_0$ and $b = v_0/\omega$ and so the amplitude is

$$C = \sqrt{x_0^2 + \frac{v_0^2}{\omega^2}}. \tag{10.2.14}$$
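
Passing from the form (10.2.6) to the form (10.2.7) requires the phase angle as well as the amplitude. The sketch below is illustrative and rests on an assumption about tooling: it uses the two-argument arctangent `math.atan2`, which takes the signs of $\cos \varphi$ and $\sin \varphi$ into account automatically, in place of the bare formula $\varphi = \arctan(-b/a)$. The function name and sample values are hypothetical.

```python
import math

def amplitude_and_phase(a, b):
    """Return C and phi such that a = C*cos(phi) and b = -C*sin(phi), Eq. (10.2.8)."""
    c = math.hypot(a, b)            # C = sqrt(a**2 + b**2)
    phi = math.atan2(-b, a)         # picks the quadrant from the signs of -b and a
    return c, phi

# Assumed illustrative values: x0 = -1, v0 = 3, omega = 3, so a = x0 and b = v0/omega.
x0, v0, omega = -1.0, 3.0, 3.0
a, b = x0, v0 / omega
c, phi = amplitude_and_phase(a, b)

t = 0.4
form_6 = a * math.cos(omega * t) + b * math.sin(omega * t)   # (10.2.6)
form_7 = c * math.cos(omega * t + phi)                        # (10.2.7)
print(c, phi, form_6, form_7)
```

With $a < 0$, as in this example, the plain arctangent would return an angle in the wrong quadrant, while the two-argument form keeps (10.2.6) and (10.2.7) in agreement.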

Exercise 

10.2.1. A body is in oscillation according 
to the law d 2 x/dt 2 = — x. Find the function 
x = x (t) and determine the period for the 
following cases: (a) $x = 0$ and $v = 2$ cm/s
at t = 0, (b) x = 1 and v = 0 at t = 0, and 
(c) x = 1 and v = 2 cm/s at t = 0. For case 
(c) write the solution in the form of (10.2.6) 
and as (10.2.7). 


10.3 Pendulums 

A model problem closely associated 
with the entire theory of harmonic oscil- 
lations is the pendulum problem. 
Imagine a small (point-like) mass m 
at the end of a weightless cord of length
$l$, with the other end suspended from a
fixed point (Figure 10.3.1). The system
is called a simple pendulum, and the
mass is called a bob. Suppose the bob
(and the cord) is deflected through an
angle $\varphi$ from the vertical. One of the
forces acting on the bob is $mg$ exerted
by gravity, where $g$ is the acceleration
of gravity. This force is directed downward.
The normal component of this
force, that is, the one perpendicular to
the path of the bob and directed outward
from the fixed point, is balanced
by the force exerted by the cord, while
the tangential component of the force
of gravity, that is, the one tangent to
the bob’s path, is equal to $-mg \sin \varphi$;
it is the latter component that drives
the pendulum. (The minus sign in the
expression for the tangential component
shows that the component tends to
diminish the angle $\varphi$.) Since the path
$s$ that the bob travels from the equilibrium
position $O$ is $l\varphi$, the bob’s velocity
is $l\,(d\varphi/dt)$ and the acceleration is
$l\,(d^2\varphi/dt^2)$. Newton’s second law for
this case is

$$ml\,\frac{d^2\varphi}{dt^2} = -mg \sin \varphi, \tag{10.3.1}$$

or

$$\frac{d^2\varphi}{dt^2} = -\frac{g}{l} \sin \varphi.$$

Figure 10.3.1

The exact solution of Eq. (10.3.1)
is in no way simple (see Eq. (7.10.14)
and the material that follows). But
if $\varphi$ is small, we can replace $\sin \varphi$
with the first term in the Taylor expansion
of $\sin \varphi$ (see Section 6.2). Equation
(10.3.1) is then replaced with the following
approximate equation:

$$\frac{d^2\varphi}{dt^2} = -\frac{g}{l}\,\varphi. \tag{10.3.1a}$$

Clearly, Eq. (10.3.1a) is simply Eq.
(10.2.1), which describes harmonic oscillations,
with $g/l$ playing the role
of $k/m$ (if we divide both sides of Eq.
(10.2.1) by $m$). Therefore, the solution
to (10.3.1a) is given by (10.2.7), with
$\omega = \sqrt{g/l}$. The oscillation period of a
simple pendulum in this approximation
is given by the formula

$$T = \frac{2\pi}{\omega} = 2\pi \sqrt{\frac{l}{g}}. \tag{10.3.2}$$

For instance, if we require that $T = 2$ s,
or that it takes one second for the pendulum
to swing in one direction, then
we get (putting $g = 9.8$ m/s$^2$) $l = gT^2/4\pi^2 \approx 0.993$ m,
or $l \approx 1$ m.^{10.8}

Here is a somewhat different approach
to the same simple pendulum problem,
which will be more convenient
when we turn to the problem of a physical
pendulum. Suppose the simple pendulum
of Figure 10.3.1 at $x = 0$ (the
equilibrium position) has potential
energy $u_0$; its kinetic energy at this
point is zero. We deflect the pendulum
through an angle $\varphi$; its horizontal
deflection is then $x$ (see Figure 10.3.1).
In this position the pendulum’s potential
energy is $u_1 = u_0 + mgz$, where
$z = l - \sqrt{l^2 - x^2}$, and the kinetic
energy is $(m/2)(dx/dt)^2$ (see Section
9.6). The sum of potential and kinetic


10 * 8 In the days of the French Revolution, 
when the metric system was being adopted 
in France, two measures of length were candi- 
dates, so to say, for the basic unit of length. 
One was the length of 1/10 000 000 of the 
distance from the equator to the North pole, 
measured along the circle (called a meridian 
circle) passing through both poles and Paris 
(this length was called a meter and formed the* 
base of the new metric system), while the 
other was the length of a “one-second” simple 
pendulum (the latter definition depends, obvi- 
ously, on location because g varies in different 
parts of the world). Both units are of the same 
order of magnitude and agree with the charac- 
teristic dimensions of the human body. 
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energies will not change in the oscilla- 
tion process, whence 

u 0 + mg(l-~Y l 2 — x 2 )+ J j (-^r) = u o, 


Now we employ the fact that $x \ll l$ (the deflection angle $\varphi$ is small), that is, $x/l \ll 1$. This enables representing $\sqrt{l^2 - x^2}$ in the form

$\sqrt{l^2 - x^2} = l\left[1 - \left(\frac{x}{l}\right)^2\right]^{1/2} \approx l\left[1 - \frac{1}{2}\left(\frac{x}{l}\right)^2\right] = l - \frac{x^2}{2l}$

(here we retain only the first two terms in the expansion of $\sqrt{1 - (x/l)^2} = [1 - (x/l)^2]^{1/2}$ in powers of $x/l$ via Maclaurin's formula; see Chapter 6), after which the equation of pendulum oscillations assumes the form

$\left(\frac{dx}{dt}\right)^2 = -\frac{g}{l}\,x^2 + \text{const}$.

Differentiating both sides of the above equation with respect to $t$, we get

$2\,\frac{dx}{dt}\,\frac{d^2x}{dt^2} = -2\,\frac{g}{l}\,x\,\frac{dx}{dt}$,

whence

$\frac{d^2x}{dt^2} = -\frac{g}{l}\,x$. (10.3.1b)

This constitutes a different form of the equation of small oscillations of a (simple) pendulum; the reason why it is similar to Eq. (10.3.1a) is that for small $x$ and $\varphi$ we have $\sin\varphi = x/l \approx \varphi$, that is, $x \approx l\varphi$.
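To see how good the small-angle approximation is, one can compare formula (10.3.2) with a direct numerical integration of the exact equation (10.3.1). The sketch below is illustrative only: the function names, the Runge-Kutta scheme, the step size and the sample amplitudes are assumptions, not part of the original text.

```python
import math

def small_angle_period(l, g=9.8):
    """Period T = 2*pi*sqrt(l/g) from formula (10.3.2)."""
    return 2 * math.pi * math.sqrt(l / g)

def length_for_period(T, g=9.8):
    """Length l = g*T^2 / (4*pi^2) obtained by inverting (10.3.2)."""
    return g * T**2 / (4 * math.pi**2)

def numeric_period(l, phi0, g=9.8, dt=1e-4):
    """Period of the exact equation d2phi/dt2 = -(g/l) sin(phi).

    The bob starts at rest at angle phi0 > 0; the time needed to reach
    phi = 0 is a quarter of the period. Integration uses classical RK4.
    """
    def acc(phi):
        return -(g / l) * math.sin(phi)

    phi, w, t = phi0, 0.0, 0.0
    while True:
        k1p, k1w = w, acc(phi)
        k2p, k2w = w + 0.5 * dt * k1w, acc(phi + 0.5 * dt * k1p)
        k3p, k3w = w + 0.5 * dt * k2w, acc(phi + 0.5 * dt * k2p)
        k4p, k4w = w + dt * k3w, acc(phi + dt * k3p)
        new_phi = phi + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        new_w = w + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        if new_phi <= 0.0:
            # Linear interpolation locates the zero crossing inside the step.
            frac = phi / (phi - new_phi)
            return 4 * (t + frac * dt)
        phi, w, t = new_phi, new_w, t + dt

l = length_for_period(2.0)   # about 0.993 m, as in the text
T0 = small_angle_period(l)
print(l, T0, numeric_period(l, 0.05) / T0, numeric_period(l, 1.0) / T0)
```

For a deflection of 0.05 rad the two periods agree closely, while for 1 rad the exact period is noticeably longer, which shows why the text restricts (10.3.1a) to small $\varphi$.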

The ideas developed here can be applied to the study of the oscillatory motion of a solid fixed at a definite "point of suspension". For the sake of simplicity we will speak of a rod suspended at a point $A$ (Figure 10.3.2). Let the center of gravity be below the point of suspension, the distance between the point of suspension and the center of gravity being $l$. Let us determine the period of oscillation of the rod (the physical pendulum).

If the pendulum is deflected from its position of equilibrium through a small angle $\varphi$, the height $z$ of the center of gravity of the rod above its lowest position (the one corresponding to the position of equilibrium) is $l - l\cos\varphi$ (cf. the aforesaid). The potential energy $u$ of the rod can be written as

$u = mgz = mg\,(l - l\cos\varphi) = mgl\,(1 - \cos\varphi)$. (10.3.3)

We expand $\cos\varphi$ in a series in powers of $\varphi$ and, since $\varphi$ is small, confine ourselves to the first two terms: $\cos\varphi \approx 1 - \varphi^2/2$. Therefore approximately we have $u \approx mgl\,\varphi^2/2$. Clearly, this relationship implies that the potential energy grows with the magnitude of the angle $\varphi$, that is, as the distance from the position of the center of gravity to the position of equilibrium (where $\varphi = 0$, $z = 0$, and $u = 0$) increases.

The kinetic energy of rotation of the rod about point $A$ is

$K = \frac{I}{2}\left(\frac{d\varphi}{dt}\right)^2$, (10.3.4)

where $I$ is the moment of inertia of the rod (about the point of suspension). But according to (9.12.13), $I = ml^2 + I_0$, whence $K = \frac{1}{2}(ml^2 + I_0)\left(\frac{d\varphi}{dt}\right)^2$, where $I_0$ is the moment of inertia of the rod about the center of gravity.

Suppose that the rod performs harmonic oscillations, that is, $\varphi = a\cos\omega t$. By the law of conservation of energy, $u + K = \text{constant}$ and $u_{\max} = K_{\max}$, which means that $K_{\max}$ is attained at the position of equilibrium, where $u = 0$, while $u_{\max}$ is attained at the position where $K = 0$. Since $d\varphi/dt = -a\omega\sin\omega t$, we have

$K_{\max} = \frac{1}{2}(ml^2 + I_0)\,a^2\omega^2$,

$u_{\max} = mgl\,\frac{a^2}{2}$,

whence

$\frac{1}{2}\,mgl\,a^2 = \frac{1}{2}(ml^2 + I_0)\,a^2\omega^2$,

that is,

$\omega = \sqrt{\frac{mgl}{ml^2 + I_0}}$. (10.3.5)

The oscillation period is

$T = 2\pi/\omega$. (10.3.5a)

Formulas (10.3.5) and (10.3.5a) imply that the greater the moment of inertia $I_0$ of the rod, the lower the frequency of oscillations of the physical pendulum and, hence, the greater the period $T$.

If the entire mass of the rod is concentrated at the center of gravity, then $I_0 = 0$. In this case we obtain the well-known formulas (10.3.2) for the oscillation period (and frequency) of a simple pendulum.

If $I_0 \neq 0$, there is a definite position of the point of suspension for which the frequency is at a maximum. Since the position of the suspension point is characterized by $l$, to find the position we desire let us solve the equation $d\omega/dl = 0$. After simple algebraic manipulations we arrive at an equation for $l$:

$mg\,(ml^2 + I_0) - mgl \times 2ml = 0$, (10.3.6)

whence we have $l_{\max} = \sqrt{I_0/m}$.

For a rod of length $L$ with uniform distribution of mass, $I_0 = mL^2/12$ (see Exercise 9.12.1) and therefore

$l_{\max} = L/\sqrt{12} \approx 0.3L$.
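The location of the frequency maximum can be confirmed by a brute-force scan of formula (10.3.5). This is a hedged sketch: the function name `omega`, the unit mass and unit length of the rod, and the grid of suspension distances are all assumptions made for illustration.

```python
import math

def omega(l, m, I0, g=9.8):
    """Frequency of the physical pendulum, formula (10.3.5)."""
    return math.sqrt(m * g * l / (m * l * l + I0))

# Hypothetical uniform rod: mass 1 kg, length 1 m.
m, L = 1.0, 1.0
I0 = m * L**2 / 12                # moment of inertia about the center of gravity
l_best = math.sqrt(I0 / m)        # root of Eq. (10.3.6)

# Scan suspension distances from 0.0005 m to 0.5 m and pick the largest frequency.
grid = [L * i / 2000 for i in range(1, 1001)]
l_scan = max(grid, key=lambda l: omega(l, m, I0))

print(l_best, l_scan, 2 * math.pi / omega(l_best, m, I0))
```

The scanned optimum agrees with $\sqrt{I_0/m} = L/\sqrt{12}$ to within the grid spacing; the last printed number is the corresponding (minimum) period.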

Note that the minimum frequency, $\omega = 0$, is attained, obviously, at $l = 0$, that is, when the point of suspension coincides with the center of gravity of the rod, or when no oscillatory motion takes place.$^{10.9}$ But $l = 0$ is not a root of Eq. (10.3.6), that is, $d\omega/dl$ does not vanish at this point. Here we are dealing with a so-called cuspidal minimum attained at the boundary of the region over which $l$ varies (see Section 7.2).

Figure 10.3.3

Exercise 

10.3.1. A pendulum is in the form of a 
triangular lamina (tin or cardboard) (Fig- 
ure 10.3.3). Determine the period of oscillation 
if the pendulum is suspended (a) from the 
acute end A , and (b) from the middle of the 
base, or point B. In both cases, indicate how 
the point of suspension is to be displaced so as 
to obtain a minimum oscillation period. 


10.4 Oscillation Energy.
Damped Oscillations

Let us write down the general solution to Eq. (10.2.1) (the general harmonic oscillation) as $x = C\cos(\omega t + \alpha)$. The potential energy of the oscillating body is equal, at every instant, to

$u(x(t)) = \frac{kx^2}{2} = \frac{kC^2}{2}\cos^2(\omega t + \alpha)$,

and the kinetic energy is

$K(t) = \frac{mv^2}{2} = \frac{m}{2}\left[-C\omega\sin(\omega t + \alpha)\right]^2 = \frac{mC^2\omega^2}{2}\sin^2(\omega t + \alpha)$.

The oscillation frequency, as we already know, is determined by the formula $\omega^2 = k/m$. Substituting $\omega^2$ into the expression for the kinetic energy, we get

$K(t) = \frac{kC^2}{2}\sin^2(\omega t + \alpha)$.

$^{10.9}$ The value $\omega = 0$ is, of course, purely nominal; it means that $\omega \to 0$ as $l \to 0$ (the closer the suspension point is to the center of gravity of the rod, the greater the period $T$).

Figure 10.4.1 (curves of $\sin^2(\omega t + \alpha)$ and $\cos^2(\omega t + \alpha)$)

Thus, the factor in front of the trigonometric function in the expression for potential energy and in the expression for kinetic energy is the same. The functions themselves, $\cos^2(\omega t + \alpha)$ and $\sin^2(\omega t + \alpha)$, are very much alike, one being derivable from the other by a displacement along the time axis amounting to $\Delta t = \pi/(2\omega)$ (Figure 10.4.1). Each of the quantities $u$ and $K$ oscillates between maximum and zero: when one is at maximum, the other is at zero. Observe that the functions $\cos^2(\omega t + \alpha)$ and $\sin^2(\omega t + \alpha)$ describe oscillations about the mean (average) value equal to half the maximum. This is clearly evident from Figure 10.4.1 and also from the familiar formulas

$\cos^2\beta = \frac{1}{2}(1 + \cos 2\beta) = \frac{1}{2} + \frac{1}{2}\cos 2\beta$,

$\sin^2\beta = \frac{1}{2}(1 - \cos 2\beta) = \frac{1}{2} - \frac{1}{2}\cos 2\beta$. (10.4.1)

It is clear here that the quantity $(1/2)\cos 2\beta$ oscillates, taking on positive and negative values alternately, and $1/2$ is the mean value of $\cos^2\beta$. The sum of the potential and kinetic energies (that is, the total energy of the system),

$E = K + u = \frac{kC^2}{2}\left[\cos^2(\omega t + \alpha) + \sin^2(\omega t + \alpha)\right] = \frac{kC^2}{2}$, (10.4.2)

is constant, as was to be expected.
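Formula (10.4.2) can be checked by evaluating both energies at a number of instants. The sketch below is an illustration under assumed values; the function name `energies` and the numbers chosen for $k$, $m$, $C$ and $\alpha$ are not from the text.

```python
import math

def energies(t, k, m, C, alpha=0.0):
    """Return (u, K) for x = C cos(omega t + alpha) with omega = sqrt(k/m)."""
    omega = math.sqrt(k / m)
    x = C * math.cos(omega * t + alpha)
    v = -C * omega * math.sin(omega * t + alpha)
    return 0.5 * k * x * x, 0.5 * m * v * v

# Illustrative values: k = 4, m = 1, C = 0.5, alpha = 0.3.
k, m, C = 4.0, 1.0, 0.5
totals = [sum(energies(0.1 * i, k, m, C, 0.3)) for i in range(100)]
print(min(totals), max(totals), 0.5 * k * C * C)
```

The smallest and largest sampled totals coincide with $kC^2/2$ up to rounding, while $u$ and $K$ individually swing between zero and that value.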

Note that if we were to specify the motion with a frequency that did not satisfy the formula $\omega = \sqrt{k/m}$, the sum of the potential and kinetic energies would not be a constant and the maximum kinetic energy would not be equal to the maximum potential energy. This is not surprising since oscillations with a frequency different from $\omega = \sqrt{k/m}$ do not satisfy the basic equation of motion. Hence, for such oscillations to occur, it is necessary that the body be driven by some other, external, forces besides the force $F = -kx$ (this force is associated with the potential $u(x) = kx^2/2$). Because of the work of external forces, the total energy $mv^2/2 + kx^2/2$ will no longer be conserved.

Let us now investigate the problem of damping of oscillations. Let a body be acted upon by the force of friction in addition to the force $F = -kx$ of a spring. Suppose the friction is so small that over one period the work of the friction force is small compared with the oscillation energy. We can then assume approximately that the oscillations occur as in the case of no friction: $x(t) = C\cos(\omega t + \varphi)$ (see Eq. (10.2.7)). The oscillation energy is $kC^2/2$. In the case of friction, the energy of the oscillations diminishes with the passage of time. Thus, friction will cause the amplitude $C$ in (10.2.7) to decrease slowly instead of remaining constant. The law of decrease of $C$ is defined by the condition that the decrease in total energy $E$ is equal to the work of the force of friction $F_1$.

With respect to unit time, these quantities are related thus:

$\frac{dE}{dt} = \frac{d}{dt}\left(\frac{kC^2}{2}\right) = kC\,\frac{dC}{dt} = F_1 v = W_1$, (10.4.3)

where $v$ is the velocity of the body, and $W_1$ is the power of the friction force. In the oscillation process, both $v$ and $F_1$ vary periodically. The friction force $F_1$ is always dependent on velocity $v$ and is directed opposite to $v$, so that the product $F_1 v$ always remains negative. In the case at hand of small friction, which results in a slow damping of the oscillations, we can assume that the change in amplitude $C(t)$ is small over several oscillation periods. The product $F_1 v$ in (10.4.3) is then to be understood as the average value of the product over one period. Formula (10.4.3) holds true only for time intervals exceeding one oscillation period.

By way of an illustration, let us examine the effect of a force of friction proportional to the first power of the velocity of the body:

$F_1 = -hv$, $F_1 v = -hv^2$. (10.4.4)

If $v = dx/dt = -C\omega\sin(\omega t + \varphi)$, then

$F_1 v = -hC^2\omega^2\sin^2(\omega t + \varphi)$. (10.4.5)

Note that the average value of $\sin^2(\omega t + \varphi)$ over one period is $1/2$ (see formula (10.4.1) and Section 7.8). Using (10.4.3) and (10.4.4), we finally get

$\frac{d}{dt}\left(\frac{kC^2}{2}\right) = -\frac{hC^2\omega^2}{2}$,

whence

$\frac{dC}{dt} = -\frac{h\omega^2}{2k}\,C$,

that is (cf. (6.6.10)),

$C = C_0 e^{-\gamma t}$, (10.4.6)

where

$\gamma = h\omega^2/2k$. (10.4.6a)

But since $\omega^2 = k/m$, formula (10.4.6a) can be simplified:

$\gamma = h/2m$. (10.4.6b)

In formula (10.4.6), $C_0$ is determined by the initial conditions. Multiplying both sides of (10.4.6) by $\cos(\omega t + \varphi)$ and employing (10.2.7), we get

$x(t) = C_0 e^{-\gamma t}\cos(\omega t + \varphi)$, (10.4.7)

where $\omega = \sqrt{k/m}$ (see a characteristic graph of damped oscillations in Figure 10.4.2). This is an approximate formula obtained on the assumption that the friction is low and is proportional to the first power of velocity.
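As a rough illustration of formulas (10.4.6), (10.4.6b) and (10.4.7), the sketch below computes the damping rate, the time in which the amplitude halves, and how many oscillation periods that takes. The function names and the sample values of $m$, $k$, $h$ and $C_0$ are assumptions for illustration only.

```python
import math

def decay_rate(h, m):
    """gamma = h / (2m), formula (10.4.6b)."""
    return h / (2 * m)

def amplitude(t, C0, h, m):
    """C(t) = C0 * exp(-gamma t), formula (10.4.6)."""
    return C0 * math.exp(-decay_rate(h, m) * t)

def x_approx(t, C0, h, m, k, phi=0.0):
    """Approximate damped motion (10.4.7), valid for small friction."""
    omega = math.sqrt(k / m)
    return amplitude(t, C0, h, m) * math.cos(omega * t + phi)

# Illustrative values: weak friction compared with the spring.
m, k, h, C0 = 1.0, 100.0, 0.2, 1.0
gamma = decay_rate(h, m)
half_time = math.log(2) / gamma           # time for the amplitude to halve
period = 2 * math.pi / math.sqrt(k / m)
print(gamma, half_time, half_time / period)
```

With these numbers the amplitude halves only after roughly ten periods, which is the regime of slow damping assumed in deriving (10.4.3).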

Figure 10.4.2

In all cases where the friction force is proportional to the first power of velocity, the problem has an exact solution. The body is acted on by two forces, the elastic force $-kx$ and the force of friction $-h\,(dx/dt)$. By Newton's second law,

$m\,\frac{d^2x}{dt^2} = -kx - h\,\frac{dx}{dt}$. (10.4.8)

We will seek the solution $x(t)$ in the same form as was obtained for a small friction:

$x(t) = C_0 e^{-\gamma t}\cos(\omega_1 t + \alpha)$. (10.4.9)

This solution depends on four parameters: $C_0$, $\gamma$, $\omega_1$, and $\alpha$. The values of $\gamma$ and $\omega_1$ are determined from Eq. (10.4.8), while $C_0$ and $\alpha$ remain arbitrary, that is, Eq. (10.4.8) does not determine them, and they require specifying the initial conditions. Indeed, from (10.4.9) we obtain

$\frac{dx}{dt} = -\gamma C_0 e^{-\gamma t}\cos(\omega_1 t + \alpha) - C_0\omega_1 e^{-\gamma t}\sin(\omega_1 t + \alpha)$,

$\frac{d^2x}{dt^2} = \gamma^2 C_0 e^{-\gamma t}\cos(\omega_1 t + \alpha) + \gamma C_0\omega_1 e^{-\gamma t}\sin(\omega_1 t + \alpha) + C_0\omega_1\gamma e^{-\gamma t}\sin(\omega_1 t + \alpha) - C_0\omega_1^2 e^{-\gamma t}\cos(\omega_1 t + \alpha)$.

Substituting into (10.4.8) the expressions for $x$, $dx/dt$, and $d^2x/dt^2$ and canceling out $C_0 e^{-\gamma t}$, we get

$m\gamma^2\cos(\omega_1 t + \alpha) + m\gamma\omega_1\sin(\omega_1 t + \alpha) + m\gamma\omega_1\sin(\omega_1 t + \alpha) - m\omega_1^2\cos(\omega_1 t + \alpha) = -k\cos(\omega_1 t + \alpha) + h\gamma\cos(\omega_1 t + \alpha) + h\omega_1\sin(\omega_1 t + \alpha)$,

or

$(m\gamma^2 - m\omega_1^2)\cos(\omega_1 t + \alpha) + 2m\gamma\omega_1\sin(\omega_1 t + \alpha) = (-k + h\gamma)\cos(\omega_1 t + \alpha) + h\omega_1\sin(\omega_1 t + \alpha)$. (10.4.10)

The last equation holds true for arbitrary $t$ if

$m\gamma^2 - m\omega_1^2 = -k + h\gamma$,

$2m\gamma\omega_1 = h\omega_1$. (10.4.11)


Canceling $\omega_1$ out of the second equation, we obtain $\gamma = h/2m$. Then from the first we can find $\omega_1$:

$\omega_1^2 = \frac{k}{m} - \frac{h^2}{4m^2}$. (10.4.12)

Consequently,

$x(t) = C_0 e^{-ht/2m}\cos\left(\sqrt{\frac{k}{m} - \frac{h^2}{4m^2}}\;t + \alpha\right)$. (10.4.13)

When friction is low, that is, $h$ is small ($h^2 \ll 4km$), we can ignore $h^2/4m^2$ under the radical sign as compared with $k/m$. Then (10.4.13) becomes (10.4.7). Thus, in the approximate consideration we correctly obtained the law of decrease of the oscillation amplitude but did not notice the small variation in frequency due to friction. Note also that of the two laws that govern the decrease in amplitude, (10.4.6), (10.4.6a) and (10.4.6), (10.4.6b), only the second holds true for arbitrary friction (not too large, that is, such that $h^2 < 4km$ and the root in (10.4.13) is meaningful), while the first coincides with the second at small $h$ but in the general case is invalid because it contains $\omega$ (which is not generally a constant).

If the friction is so high that $h^2 > 4km$, the radicand in (10.4.13) becomes negative and the formula becomes meaningless. This means that motion involving appreciable friction is no longer oscillatory. In this case the solution to Eq. (10.4.8) is to be sought in the form $x = Ce^{-\gamma t}$. Substituting this into Eq. (10.4.8), we arrive at a quadratic equation for $\gamma$ with two solutions: $\gamma = \gamma_1$ and $\gamma = \gamma_2$. The sum $C_1 e^{-\gamma_1 t} + C_2 e^{-\gamma_2 t}$ of the two solutions corresponding to $\gamma_1$ and $\gamma_2$ will yield the general solution to Eq. (10.4.8) and will enable solving the problem for arbitrary initial data. This case of large $h$ is considered in detail in Section 13.10 in connection with electrical oscillations.

Exercises

10.4.1. Find the law of damping of oscillations for friction proportional to the second power of velocity (this friction is characteristic of rapid motion of a body in a low-viscosity fluid). Show that after the lapse of a large time interval, the amplitude $C(t) = 1/bt$, where $b$ is a constant independent of $C_0$, which is the amplitude at the initial time.

10.4.2. Find the law of damping of oscillations for friction that is independent of velocity (this is characteristic of the friction between hard dry surfaces). Determine the time interval after which the oscillations cease.
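The exact damped solution of Section 10.4, with $\gamma = h/2m$ and $\omega_1^2 = k/m - h^2/4m^2$, can be checked by substituting it back into Eq. (10.4.8). The following is a minimal sketch under assumed sample values; the function names are hypothetical, and the derivatives are the analytic ones obtained by differentiating (10.4.9).

```python
import math

def damped_params(m, k, h):
    """Return (gamma, omega1) with gamma = h/(2m), omega1^2 = k/m - h^2/(4m^2)."""
    gamma = h / (2 * m)
    w1_sq = k / m - h * h / (4 * m * m)
    if w1_sq <= 0:
        raise ValueError("h^2 >= 4km: the motion is not oscillatory")
    return gamma, math.sqrt(w1_sq)

def x_and_derivs(t, m, k, h, C0=1.0, alpha=0.0):
    """x, dx/dt and d2x/dt2 for x = C0 exp(-gamma t) cos(omega1 t + alpha)."""
    g, w1 = damped_params(m, k, h)
    e = C0 * math.exp(-g * t)
    c, s = math.cos(w1 * t + alpha), math.sin(w1 * t + alpha)
    x = e * c
    v = e * (-g * c - w1 * s)
    a = e * ((g * g - w1 * w1) * c + 2 * g * w1 * s)
    return x, v, a

def residual(t, m, k, h):
    """m x'' + k x + h x', which should vanish if (10.4.8) is satisfied."""
    x, v, a = x_and_derivs(t, m, k, h)
    return m * a + k * x + h * v

# Illustrative values: m = 2, k = 50, h = 3 (so h^2 < 4km).
worst = max(abs(residual(0.05 * i, 2.0, 50.0, 3.0)) for i in range(200))
print(worst)
```

The residual stays at rounding level for all sampled instants, and the helper refuses parameters with $h^2 \geq 4km$, for which the oscillatory form does not apply.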


10.5 Forced Oscillations 
and Resonance 

Consider a body acted upon by an elastic force $F = -kx$. We have established that this force causes the body to oscillate with a definite frequency $\omega = \sqrt{k/m}$, which is known as the natural (or free) frequency. From now on we will denote the natural frequency by $\omega_0$, so that $\omega_0 = \sqrt{k/m}$.

Now let the body be acted upon by the elastic force and a periodic external force with frequency $\omega$, which generally may differ from $\omega_0$. It then turns out that the amplitude of the oscillations brought about by the external force is very strongly dependent on how close the frequency $\omega$ of the external force is to the natural frequency $\omega_0$, that is, to the frequency of natural oscillations occurring in the absence of external forces. The phenomenon in which the oscillation amplitude increases as $\omega$ approaches $\omega_0$ is called resonance and has a wide range of applications. It occurs in any system that admits oscillations and vibrations. In mechanical systems (machine tools or motors) such vibrations can result in deformation and destruction of the equipment. On the other hand, resonance can be useful: at times it is purposely used to produce, via a small force, vibrations of the operating tool with great amplitude.

In electric systems, resonance enables us, in the presence of several periodic forces with different frequencies (say, a number of radio transmitters), to achieve a situation in which the oscillations in our system depend solely on one of the periodic forces (the one whose frequency is close to the natural frequency of the system). This allows for tuning a radio set to a definite station.

Let us set up the oscillation equation:

$m\,\frac{d^2x}{dt^2} = -kx - h\,\frac{dx}{dt} + f\cos\omega t$, (10.5.1)

where the terms $-kx$ and $-h\,(dx/dt)$ correspond to the elastic force and the friction force (see Eq. (10.4.8)), and $f\cos\omega t$ represents the external force. Divide both sides of (10.5.1) by $m$ and set $k/m = \omega_0^2$ in accordance with the fact that (in the absence of friction) $\omega_0$ is the natural frequency of oscillations of the body. Denote the ratio $h/m$ by $2\gamma$ (see formula (10.4.6b)). Then Eq. (10.5.1) assumes the form

$\frac{d^2x}{dt^2} = -\omega_0^2 x - 2\gamma\,\frac{dx}{dt} + \frac{f}{m}\cos\omega t$. (10.5.1a)

It is natural to expect that under the action of a force having a frequency $\omega$ the body will oscillate with that frequency. We therefore seek the solution to Eq. (10.5.1) or (10.5.1a) in the form

$x = a\cos\omega t + b\sin\omega t$. (10.5.2)

Substituting the expressions for $x$ and its derivatives into Eq. (10.5.1a), we obtain

$-a\omega^2\cos\omega t - b\omega^2\sin\omega t = -a\omega_0^2\cos\omega t - b\omega_0^2\sin\omega t + 2\gamma a\omega\sin\omega t - 2\gamma b\omega\cos\omega t + \frac{f}{m}\cos\omega t$.

For this equation to be true for all $t$, the separate terms involving $\cos\omega t$ and $\sin\omega t$ in the right- and left-hand sides must be equal. Thus, we arrive at the following system of equations:

$-a\omega^2 = -a\omega_0^2 - 2\gamma b\omega + \frac{f}{m}$,

$-b\omega^2 = -b\omega_0^2 + 2\gamma a\omega$ (10.5.3)

(cf. Eq. (10.4.11)). The second equation yields

$b = \frac{2\gamma\omega}{\omega_0^2 - \omega^2}\,a$.

Substituting this into the first equation in (10.5.3) yields

$a = \frac{f}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + (2\gamma\omega)^2}$. (10.5.4a)

Then

$b = \frac{f}{m}\,\frac{2\gamma\omega}{(\omega_0^2 - \omega^2)^2 + (2\gamma\omega)^2}$. (10.5.4b)

Going over to the form $x = C\cos(\omega t + \varphi)$ for oscillations and recalling that $C = \sqrt{a^2 + b^2}$ (see (10.2.8a)), we obtain the amplitude of oscillations produced by the external force:

$C = \frac{f}{m}\,\frac{1}{\sqrt{(\omega_0^2 - \omega^2)^2 + (2\gamma\omega)^2}}$. (10.5.5)
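Formulas (10.5.4a), (10.5.4b) and (10.5.5) translate directly into code, which also lets us confirm that $a$ and $b$ satisfy system (10.5.3). This is a sketch under assumptions: the function names are hypothetical, and the values $f/m = 1$, $\omega_0 = 1$, $\gamma = 0.1$ and $\gamma = 0.3$ are chosen purely for illustration.

```python
import math

def steady_state(omega, omega0, gamma, f_over_m=1.0):
    """Return (a, b, C) from formulas (10.5.4a), (10.5.4b) and (10.5.5)."""
    d = omega0**2 - omega**2
    denom = d * d + (2 * gamma * omega) ** 2
    a = f_over_m * d / denom
    b = f_over_m * 2 * gamma * omega / denom
    C = f_over_m / math.sqrt(denom)
    return a, b, C

def system_residuals(omega, omega0, gamma, f_over_m=1.0):
    """Left side minus right side for each equation of system (10.5.3)."""
    a, b, _ = steady_state(omega, omega0, gamma, f_over_m)
    r1 = -a * omega**2 - (-a * omega0**2 - 2 * gamma * b * omega + f_over_m)
    r2 = -b * omega**2 - (-b * omega0**2 + 2 * gamma * a * omega)
    return r1, r2

# Amplitude versus driving frequency for two assumed friction levels.
for gamma in (0.1, 0.3):
    for omega in (0.5, 0.9, 1.0, 1.1, 2.0):
        a, b, C = steady_state(omega, 1.0, gamma)
        print(gamma, omega, round(C, 4))
```

The printed table shows the amplitude peaking near $\omega = \omega_0$, and more sharply for the smaller $\gamma$.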


We see that $C$ is the greater the closer $\omega$ is to $\omega_0$. The curve of $C$ as a function of $\omega$ for a given $\omega_0$ is shown in Figure 10.5.1 for two values of $\gamma$ (with $f/m = 1$ and $\omega_0 = 1$). The less the friction $h$, that is, the less the quantity $\gamma = h/2m$, the sharper the rise in amplitude for the frequency of the external force equal to the natural frequency.

It is not hard to see that the sum of the general solution (10.4.9) to Eq. (10.4.8) and the particular solution (10.5.2) to Eq. (10.5.1),

$x = a\cos\omega t + b\sin\omega t + C_0 e^{-\gamma t}\cos(\omega_1 t + \alpha)$, (10.5.6)

where $a$ and $b$ are given by (10.5.4a) and (10.5.4b), and $\omega_1$ and $\gamma$ by (10.4.12) and (10.4.6b), is also a solution to Eq. (10.5.1). Formula (10.5.6) provides a solution that satisfies any initial data by choosing $C_0$ and $\alpha$. Indeed, suppose that $x = x_0$ and $v = v_0$ at $t = 0$. Then, using (10.5.6), we find that

$x_0 = a + C_0\cos\alpha$,

$v_0 = b\omega - C_0(\gamma\cos\alpha + \omega_1\sin\alpha)$. (10.5.7)

From this system of equations we can determine $C_0$ and $\alpha$ (see Exercise 10.5.1).

Thus, (10.5.6) is the general solution of the problem involving oscillations of a body under the action of an elastic force and a periodic external force. This general solution confirms the assumption, made at the beginning of this section, that under the protracted action of an external force with frequency $\omega$ a body will oscillate with the same frequency $\omega$. This is true because no matter what the initial conditions, they only affect the values of $C_0$ and $\alpha$, which is to say, only the last summand in the solution (10.5.6). However, in the course of time, this term, which has frequency $\omega_1$, becomes very close to zero, due to the factor $e^{-\gamma t}$, which tends to zero as $t \to \infty$, and we can neglect it for large $t$. The remaining terms describe oscillations with frequency $\omega$, which do not decay with the passage of time since they are maintained by the action of the external force, whose amplitude is constant by assumption.
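The decay of the last summand in (10.5.6) can be made concrete with a few numbers. In the sketch below the function name, the driving parameters, and the values of $C_0$ and $\alpha$ are all arbitrary assumptions (they are not fitted to any particular initial data); $a$ and $b$ follow (10.5.4a) and (10.5.4b), and $\omega_1^2 = \omega_0^2 - \gamma^2$ restates (10.4.12) in terms of $\omega_0$ and $\gamma$.

```python
import math

def forced_response(t, a, b, omega, C0, alpha, gamma, omega1):
    """Return (x, steady), where x is the full solution (10.5.6)."""
    steady = a * math.cos(omega * t) + b * math.sin(omega * t)
    transient = C0 * math.exp(-gamma * t) * math.cos(omega1 * t + alpha)
    return steady + transient, steady

# Assumed parameters: omega0 = 1, gamma = 0.2, omega = 2, f/m = 1.
omega0, gamma, omega, f_over_m = 1.0, 0.2, 2.0, 1.0
d = omega0**2 - omega**2
denom = d * d + (2 * gamma * omega) ** 2
a = f_over_m * d / denom                      # (10.5.4a)
b = f_over_m * 2 * gamma * omega / denom      # (10.5.4b)
omega1 = math.sqrt(omega0**2 - gamma**2)      # (10.4.12) with k/m = omega0^2
C0, alpha = 1.5, 0.4                          # arbitrary transient constants

gaps = []
for t in (0.0, 10.0, 25.0, 50.0):
    x, steady = forced_response(t, a, b, omega, C0, alpha, gamma, omega1)
    gaps.append(abs(x - steady))
    print(t, x - steady)
```

The difference between the full solution and the steady oscillation is bounded by $C_0 e^{-\gamma t}$ and is already negligible after a few multiples of $1/\gamma$.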

Exercises 

10.5.1. Determine $C_0$ and $\alpha$ from the system (10.5.7).

10.5.2. Because of friction, the maximum amplitude $C$ is obtained at $\omega^2$ somewhat different from $\omega_0^2$. Find the deviation of $\omega_{\max}^2/\omega_0^2$ from unity as a function of $\gamma$.

Hint. Test the radicand in (10.5.5) for a minimum, denoting $\omega^2 = z$.

10.6 On Exact and Approximate 
Solutions of Physical Problems 

In the preceding section we had the luck 
to find with comparative ease the 
exact solution of the problem involving 
oscillations of a body under the action 
of a periodic external force in the case 
of an elastic force that tends to move 
the body to the position of equilibrium, 
— kx , and friction, —h (dx/dt). With 
the exact solution at our disposal, we 
can easily find a number of important 
limiting cases. 

(1) The frequency $\omega$ of the external force is extremely low compared with $\omega_0$, where $\omega_0^2 = k/m$. Neglecting $\omega$ in (10.5.5) in comparison with $\omega_0$ we get

$C = f/m\omega_0^2 = f/k$. (10.6.1)

(2) The frequency $\omega$ of the external force is much higher than $\omega_0$. Then we can neglect $\omega_0$ in (10.5.5) and assume that

$C = \frac{f}{m}\,\frac{1}{\sqrt{\omega^4 + 4\gamma^2\omega^2}}$. (10.6.2)

But $\gamma^2\omega^2 \ll \omega^4$ (friction is not very great, since otherwise there would be no oscillations); neglecting the term $4\gamma^2\omega^2$, we obtain

$C = \frac{f}{m\omega^2}$. (10.6.2a)

(3) The friction force is small, that is, $h$ is small. Then in (10.5.5) we can neglect the term containing $\gamma$. As a result we have

$C = \frac{f}{m\,|\omega_0^2 - \omega^2|}$. (10.6.3)

(The appearance of the absolute value of the difference in (10.6.3) is due to the fact that in (10.5.5) we take the positive value of the root, since it is clear that $C$ and $f$, that is, the oscillation amplitude and the external force amplitude, can be only positive.)

(4) The phenomenon of exact resonance: the frequency $\omega$ of the external force is exactly equal to the natural frequency $\omega_0$. Then (10.5.5) is replaced with

$C = \frac{f}{m}\,\frac{1}{2\gamma\omega_0} = \frac{f}{h\omega_0}$. (10.6.4)
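The four limiting formulas can be compared with the exact amplitude (10.5.5) for a concrete set of numbers. This is an illustrative sketch only: the function names and the values $m = 1$, $f = 1$, $\omega_0 = 1$, $\gamma = 0.05$ are assumptions, and $h = 2\gamma m$ is used to express (10.6.4) through $\gamma$.

```python
import math

def C_exact(omega, omega0, gamma, f, m):
    """Exact amplitude, formula (10.5.5)."""
    return (f / m) / math.sqrt((omega0**2 - omega**2) ** 2 + (2 * gamma * omega) ** 2)

def C_low(f, m, omega0):
    """Low-frequency limit (10.6.1)."""
    return f / (m * omega0**2)

def C_high(omega, f, m):
    """High-frequency limit (10.6.2a)."""
    return f / (m * omega**2)

def C_no_friction(omega, omega0, f, m):
    """Small-friction limit (10.6.3)."""
    return f / (m * abs(omega0**2 - omega**2))

def C_resonance(omega0, gamma, f, m):
    """Exact resonance (10.6.4), written with h = 2 * gamma * m."""
    return f / (2 * gamma * m * omega0)

m, f, omega0, gamma = 1.0, 1.0, 1.0, 0.05
print(C_exact(0.01, omega0, gamma, f, m), C_low(f, m, omega0))
print(C_exact(100.0, omega0, gamma, f, m), C_high(100.0, f, m))
print(C_exact(2.0, omega0, gamma, f, m), C_no_friction(2.0, omega0, f, m))
print(C_exact(omega0, omega0, gamma, f, m), C_resonance(omega0, gamma, f, m))
```

Each pair of printed numbers should nearly coincide in the regime the corresponding limit is meant for; the resonance pair coincides exactly, since (10.6.4) is (10.5.5) evaluated at $\omega = \omega_0$.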


These limiting cases actually make up over 90% of the content of all the results obtained. When one obtains a general result, it is always advisable to simplify it by considering various limiting cases, as we have just done. The simple formulas relating to the limiting cases are more easily remembered and more frequently used in practical situations. Only once in a while does one have to resort to general formulas. Although the limiting cases do not cover everything, from them we know almost everything that is contained in the more complicated exact formula.

The question arises quite naturally 
as to the possibility of obtaining limit- 
ing formulas directly via simplifica- 
tions in the equation itself rather than 
in its solution. To solve an involved 
equation in exact fashion and then sim- 
plify the solution is just as senseless as 
to use intricate machinery to package 
goods elegantly and then immediately 
tear open the package to get them since 
each piece is to be used separately. There- 
fore it is important to learn to extract 
from the general solution of the problem 
the particular (and limiting) cases with- 
out solving the equation itself. 

To obtain limiting (approximate) 
expressions directly is particularly im- 
portant for the added reason that an 
exact solution is very sensitive to the 
slightest variations in the statement of 
the problem. A slight complication in 
the problem and one finds it impossible 
to get an exact solution. An approxi- 
mate solution is rougher but more sta- 
ble with respect to variations in the 
problem. 

Of particular importance to students 
are cases where it is possible to obtain 
and compare both solutions, exact and 
approximate. It is precisely in such 


cases that one can acquire some exper- 
ience in the proper choice of approxi- 
mations and be sure of the results, since 
the correctness of the limiting cases 
that follow from the general formulas 
can serve as a global check on the en- 
tire process of solving the general equa- 
tion, both the setting up of the equation 
that describes the process of interest 
to us and the solving of this equation. 

Let us now return to the first case: the frequency $\omega$ of the external force is low. This is clearly a slow motion. Therefore, in the original equation (10.5.1),

$m\,\frac{d^2x}{dt^2} = -kx - h\,\frac{dx}{dt} + f\cos\omega t$,

we can drop the terms $m\,(d^2x/dt^2)$ and $h\,(dx/dt)$ involving acceleration and velocity, since if $x \approx C\cos(\omega t + \varphi)$, then $dx/dt$ is proportional to $\omega$ and $d^2x/dt^2$ to $\omega^2$, and for low frequencies these quantities are small. In this case we get $0 \approx -kx + f\cos\omega t$, whence

$x \approx \frac{f}{k}\cos\omega t = C\cos\omega t$, with $C = f/k$. (10.6.5)

Therefore, for low frequencies of the external force, at each moment the applied external force $f\cos\omega t$ is balanced by the elastic force $-kx$. This result is clearly very general, for it refers to any motion with a low frequency. This limiting case is called a static case. For instance, the elastic force $G(x)$ may be any function of the coordinate and not only a linear function like $G = -kx$, and the external force $F(t)$ may be any function of time. The oscillation equation takes the form

$m\,\frac{d^2x}{dt^2} = G(x) - h\,\frac{dx}{dt} + F(t)$. (10.6.6)

It is not always possible to obtain the exact solution of this equation, but the approximate approach is preserved. Indeed, neglecting in the case of slow motion the terms in (10.6.6) that are proportional to velocity $dx/dt$ and acceleration $d^2x/dt^2$, we get $G(x) + F(t) \approx 0$, or $G(x) \approx -F(t)$. From this we find an approximate relationship between $x$ and $t$, or $x \approx x(t)$. Substituting this $x(t)$ into the exact equation (10.6.6), we can find the order of the error introduced by neglecting the terms $h\,(dx/dt)$ and $m\,(d^2x/dt^2)$.

Let us take a look at the second limiting case, very high frequency $\omega$. The time of action of the external force and, hence, the impulse during each half-cycle (while the force is acting in one direction) are small because of the short duration of the half-cycle in this case. Thus, for a given amplitude $f$ of the external force, the higher the frequency $\omega$, the smaller the velocity that the body can acquire and the smaller the displacement of the body. Neglecting the terms $kx$ and $h\,(dx/dt)$ in Eq. (10.5.1), we arrive at an equation of motion of a free body with no forces acting except the external force:

$m\,\frac{d^2x}{dt^2} \approx f\cos\omega t$. (10.6.7)

We seek the solution to this equation in the form $x = B\cos\omega t$. Then $d^2x/dt^2 = -B\omega^2\cos\omega t$, whence Eq. (10.6.7) takes the form $-Bm\omega^2\cos\omega t \approx f\cos\omega t$, from which it follows that $B \approx -f/m\omega^2$, and hence

$x \approx -\frac{f}{m\omega^2}\cos\omega t$. (10.6.8)

In the standard form $x \approx C\cos(\omega t + \varphi)$ the solution (10.6.8) can be written as follows (with $C$ positive):

$x \approx \frac{f}{m\omega^2}\cos(\omega t + \pi)$. (10.6.8a)

Here, the elastic force is

$-kx \approx \frac{kf}{m\omega^2}\cos\omega t = \frac{\omega_0^2}{\omega^2}\,f\cos\omega t$, (10.6.9)

and the force of friction is

$-h\,\frac{dx}{dt} \approx -\frac{hf}{m\omega}\sin\omega t$. (10.6.9a)

When comparing forces that depend periodically on time, one should compare not their instantaneous values but their amplitudes. The ratio of the external force $f\cos\omega t$ to the elastic force (10.6.9) (the ratio of the amplitudes) is $f \div (\omega_0^2/\omega^2)f = \omega^2/\omega_0^2$. The ratio is the higher the greater $\omega$ is. Similarly, the ratio of the external force to that of friction, $f \div (hf/m\omega) = (m/h)\,\omega$, grows without limit with $\omega$. For this reason, given high $\omega$, the external force appreciably exceeds both the elastic force and the force of friction. This supports the possibility of an approximate consideration of motion under the action of an external force alone.$^{10.10}$
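The comparison of force amplitudes in the high-frequency case can be illustrated numerically. In this sketch the function name `force_amplitudes` and all parameter values are assumptions; the displacement amplitude $f/m\omega^2$ is the one obtained above for motion under the external force alone.

```python
def force_amplitudes(omega, m, k, h, f):
    """Amplitudes of the three forces for x = -(f / (m omega^2)) cos(omega t)."""
    x_amp = f / (m * omega**2)        # displacement amplitude
    v_amp = omega * x_amp             # velocity amplitude
    return {"external": f, "elastic": k * x_amp, "friction": h * v_amp}

# Illustrative values: omega0^2 = k/m = 4, driving frequency omega = 20.
m, k, h, f, omega = 1.0, 4.0, 0.5, 2.0, 20.0
amps = force_amplitudes(omega, m, k, h, f)
ratio_elastic = amps["external"] / amps["elastic"]     # expected omega^2 / omega0^2
ratio_friction = amps["external"] / amps["friction"]   # expected (m / h) * omega
print(amps, ratio_elastic, ratio_friction)
```

Both ratios are large here, and they grow further as `omega` is increased, which is the justification for dropping the elastic and friction terms at high frequency.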

$^{10.10}$ It is essential that the friction force considered above be closer to zero the closer the velocity is to zero. In dry friction (force of friction independent of velocity), an external force less than that of friction will not cause oscillations at any frequency.

The third limiting case, neglect of friction, presents no difficulties at all. Here Eq. (10.5.1) takes the form

$m\,\frac{d^2x}{dt^2} = -kx + f\cos\omega t = -m\omega_0^2 x + f\cos\omega t$, (10.6.10)

since $\omega_0^2 = k/m$ and, hence, $k = m\omega_0^2$. We seek the solution to this equation in the form $x = C\cos(\omega t + \varphi)$. Substituting into this equation the expressions for $x$ and $d^2x/dt^2$, we get $\varphi = 0$ or $\varphi = \pi$, and $m\,|\omega_0^2 - \omega^2|\,C\cos\omega t = f\cos\omega t$, whence

$C = \frac{f}{m\,|\omega_0^2 - \omega^2|}$. (10.6.11)

Comparison of this formula with the exact formula (10.5.5) enables estimating the conditions under which this approximation works.

Clearly, at $\omega = \omega_0$ approximation (10.6.11) is not valid and the fourth limiting case (the frequency of the external force is exactly equal to the natural frequency of oscillations, or resonance) must be treated separately. (Here we will not ignore friction, however.) We seek the solution to (10.5.1) in the standard form $x = C\cos(\omega_0 t + \varphi)$. Then

$m\,\frac{d^2x}{dt^2} = -mC\omega_0^2\cos(\omega_0 t + \varphi)$,

and, since $\omega_0^2 = k/m$,

$m\,\frac{d^2x}{dt^2} = -kC\cos(\omega_0 t + \varphi) = -kx$.

Substituting into (10.5.1) the expressions for $x$ and the time derivatives yields

$hC\omega_0\sin(\omega_0 t + \varphi) + f\cos\omega_0 t = 0$.

This equation will hold true for any $t$ if $C = f/h\omega_0$ and $\varphi = -\pi/2$. Hence, the solution of our equation is

$x = \frac{f}{h\omega_0}\cos\left(\omega_0 t - \frac{\pi}{2}\right)$. (10.6.12)

The oscillation amplitude at resonance is $C = f/h\omega_0$, and we clearly see that there is no way in which we can ignore the friction, or $h$, at resonance.

Referring to Figure 10.6.1, let us look at the relationship between $C$ and $\omega$ provided by the approximate formulas (10.6.3) and (10.6.4). Formula (10.6.3) yields two branches of the curve $C = C(\omega)$, branches that go off to infinity as $\omega \to \omega_0$, while formula (10.6.4) yields a finite value $C = C_0$ $(= f/h\omega_0)$ at $\omega = \omega_0$. If we construct the curves of (10.6.3) and place the point $A = A(\omega_0, C_0)$ that corresponds to formula (10.6.4), it is then easy to draw free-hand a smooth curve (dashed in Figure 10.6.1) which, far from resonance, coincides with the curves of (10.6.3) and has a maximum $C_0$ at $\omega = \omega_0$.

In the case of resonance, the amplitude $C$ can be determined by means of energy considerations, whose value consists in the fact that they also enable solving approximately certain problems that have no exact solutions. The power developed by an external force $f\cos\omega t$ in the case of motion specified by the expression $x = C\cos(\omega t + \varphi)$ is

$$W_{ex} = f\cos\omega t\,\frac{dx}{dt} = -fC\omega\cos\omega t\,\sin(\omega t + \varphi).$$

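Let us determine the average power of the external force during a large (more precisely, infinite) interval of time:

$$\overline{W}_{ex} = -fC\omega\,\overline{\cos\omega t\,\sin(\omega t + \varphi)};$$

here the bar stands for averaging (see Section 7.8). Note that

$$\cos\omega t\,\sin(\omega t + \varphi) = \tfrac{1}{2}\sin 2\omega t\,\cos\varphi + \cos^2\omega t\,\sin\varphi,$$

and so

$$\overline{\cos\omega t\,\sin(\omega t + \varphi)} = \tfrac{1}{2}\,\overline{\sin 2\omega t}\,\cos\varphi + \overline{\cos^2\omega t}\,\sin\varphi = \tfrac{1}{2}\sin\varphi,$$

since $\overline{\sin 2\omega t} = 0$ and $\overline{\cos^2\omega t} = 1/2$. Consequently,

$$\overline{W}_{ex} = -\frac{fC\omega}{2}\sin\varphi,$$

which can be written as

$$\overline{W}_{ex} = \frac{fC\omega}{2}\cos\left(\varphi + \frac{\pi}{2}\right). \tag{10.6.13}$$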
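
The averaging step is easy to check numerically. The sketch below averages the instantaneous power over one full period with a midpoint rule and can be compared with the right-hand side of (10.6.13); the helper names and the sample count are assumptions made for illustration.

```python
import math

def mean_over_period(g, omega, samples=20000):
    """Average g(t) over one period 2*pi/omega using the midpoint rule."""
    period = 2 * math.pi / omega
    dt = period / samples
    return sum(g((i + 0.5) * dt) for i in range(samples)) / samples

def mean_external_power(f, C, omega, phi):
    """Average of W_ex = -f*C*omega*cos(omega*t)*sin(omega*t + phi)."""
    return mean_over_period(
        lambda t: -f * C * omega * math.cos(omega * t) * math.sin(omega * t + phi),
        omega,
    )

def formula_10_6_13(f, C, omega, phi):
    """Closed form (10.6.13): (f*C*omega/2) * cos(phi + pi/2)."""
    return 0.5 * f * C * omega * math.cos(phi + math.pi / 2)

print(mean_external_power(2.0, 3.0, 1.5, -math.pi / 3))
print(formula_10_6_13(2.0, 3.0, 1.5, -math.pi / 3))
```

Averaging over one period is enough here because the integrand is periodic; the average over an infinite interval is the same.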

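Now let us determine the average power of the force of friction. Since $F_{fr} = -hv$, it follows that

$$\overline{W}_{fr} = -h\,\overline{v^2}. \tag{10.6.14}$$

But

$$\overline{v^2} = \overline{\left(\frac{dx}{dt}\right)^2} = C^2\omega^2\,\overline{\sin^2(\omega t + \varphi)} = \frac{C^2\omega^2}{2}$$

(see (10.4.1)). Therefore (10.6.14) yields $\overline{W}_{fr} = -hC^2\omega^2/2$.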

Since the work of the external force goes to overcome friction, the average powers of the external force and the force of friction must be equal in absolute value:

$$\overline{W}_{ex} = -\overline{W}_{fr}, \tag{10.6.15}$$

that is,

$$\frac{fC\omega}{2}\cos\left(\varphi + \frac{\pi}{2}\right) = \frac{hC^2\omega^2}{2},$$

or $f\cos(\varphi + \pi/2) = hC\omega$, whence

$$C = \frac{f}{h\omega}\cos\left(\varphi + \frac{\pi}{2}\right). \tag{10.6.16}$$

The maximum possible amplitude (at resonance) results, as may be seen from (10.6.16), at $\cos(\varphi + \pi/2) = 1$, that is, at $\varphi = -\pi/2$, and this determines the initial phase angle $\varphi$. Here $\omega = \omega_0$ and $C = f/h\omega_0$. Hence, the solution of the equation of motion in the case of resonance is

$$x = \frac{f}{h\omega_0}\cos\left(\omega_0 t - \frac{\pi}{2}\right).$$

We again have the formula (10.6.12).

Let us return to formula (10.6.13). From this formula we see that at resonance $\overline{W}_{ex}$ has the largest value since at resonance $\cos(\varphi + \pi/2) = 1$. For this reason, in the case of resonance an external force develops maximum average power and, consequently, performs maximum work.


These arguments of an energy nature make it possible to determine the amplitude at resonance also in the case of a more complicated dependence of the force of friction on velocity. Let the force of friction be proportional to a (fixed) power of the absolute value of velocity:

$$F_{fr} = -hv\,|v|^{n-1}, \tag{10.6.17}$$

where $h$ is positive, and therefore the first factor ensures that the signs of $v$ and $F_{fr}$ are different. At $v > 0$ this formula yields $F_{fr} = -hv^n$, while at $v < 0$ we get $F_{fr} = h\,|v|^n$, that is, in all cases the force of friction is in opposition to velocity. Since we assume, as before, that the motion is on the whole determined by the external periodic force $F = f\cos\omega t$, that is, is of an oscillatory nature with frequency $\omega$, or $x = C\cos(\omega t + \varphi)$, the average power of the external force is given by formula (10.6.13) for this case, too.


Let us determine $\overline{W}_{fr}$. The instantaneous value is $W_{fr} = vF_{fr} = -hv^2\,|v|^{n-1}$; since $v^2 = |v|^2$, we can write $W_{fr} = -h\,|v|^{n+1}$. Substituting here the value of $v$, we find that

$$W_{fr} = -hC^{n+1}\omega_0^{n+1}\,|\sin(\omega_0 t + \varphi)|^{n+1}. \tag{10.6.18}$$

Using this, we get$^{10.11}$

$$\overline{W}_{fr} = -hC^{n+1}\omega_0^{n+1}A, \qquad A = \overline{|\sin(\omega_0 t + \varphi)|^{n+1}}.$$

The condition (10.6.15) yields

$$hC^{n+1}\omega_0^{n+1}A = \frac{fC\omega_0}{2}\cos\left(\varphi + \frac{\pi}{2}\right),$$

whence

$$C^n = \frac{f}{2hA\,\omega_0^n}\cos\left(\varphi + \frac{\pi}{2}\right).$$

The maximum amplitude attained at resonance is

$$C = \frac{1}{\omega_0}\sqrt[n]{\frac{f}{2hA}}. \tag{10.6.19}$$
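
The constant $A$ and formula (10.6.19) can be evaluated numerically. The sketch below averages $|\sin u|^{n+1}$ over a full period by a midpoint rule; the function names and sample count are illustrative assumptions, and `resonance_amplitude` is meant for $n \ge 1$ only.

```python
import math

def mean_abs_sin_power(n, samples=100000):
    """A = average of |sin u|**(n+1) over one full period (midpoint rule)."""
    du = 2 * math.pi / samples
    total = sum(abs(math.sin((i + 0.5) * du)) ** (n + 1) for i in range(samples))
    return total / samples

def resonance_amplitude(f, h, omega0, n):
    """Formula (10.6.19): C = (1/omega0) * (f / (2*h*A))**(1/n), for n >= 1."""
    A = mean_abs_sin_power(n)
    return (f / (2 * h * A)) ** (1.0 / n) / omega0

for n in range(4):
    print(n, round(mean_abs_sin_power(n), 4))
```

For $n = 1$ the function `resonance_amplitude` should reproduce $C = f/h\omega_0$, which gives a quick consistency check against (10.6.12).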


A particular case of formula (10.6.19) for $n = 1$ (friction proportional to the first power of velocity) is the earlier-found formula $C = f/h\omega_0$. Note that at $n \ne 1$ the equation of motion becomes nonlinear and we cannot write a general solution to this equation that would be similar to the one found in Section 10.5: there is no way in which an exact solution can be found via elementary methods in this case.

10.11 For reference we give values of $A$ for a few $n$: $n = 0$, $A = 2/\pi \approx 0.64$; $n = 1$, $A = 0.5$; $n = 2$, $A = 4/3\pi \approx 0.42$; and $n = 3$, $A = 3/8 = 0.375$.


10.7 Combining Oscillations. Beats

Quite common is the situation when a body performs several oscillations simultaneously. Visible light, for instance, is the sum effect of many vibrational processes related to the various electron transitions in the atoms of the body emitting the light (see Chapter 12). Similarly, the sound vibrations that we perceive when listening to a symphony orchestra are the sum of vibrations of the air generated by each musical instrument in the orchestra, and the conductor’s goal is to control the various instruments and hence all these vibrations to produce an effect in which all the vibrations merge into a beautiful melody. (The effect of purely random mixing of many sound vibrations is known to anybody who has listened to an orchestra tuning up before a concert, each musician tuning his instrument without regard for the other instruments.) There are also many cases (sometimes important) when two or more harmonic oscillations are superimposed; more than that, almost any motion can be represented as a sum of many harmonic oscillations (see Section 10.9).

We start with the addition of two 
harmonic oscillations. The simplest 
case here is when the oscillations are of 
the same frequency, for instance, when 
we are considering the sound produced 
by two violins playing in unison. The 
result is a harmonic oscillation of the 
same frequency. Indeed, if the two oscil- 
latory processes are described by the 
formulas 

$$x_1 = A\cos\omega t + B\sin\omega t, \qquad x_2 = C\cos\omega t + D\sin\omega t, \tag{10.7.1}$$

then the sum $x = x_1 + x_2$ has the form

$$x = (A + C)\cos\omega t + (B + D)\sin\omega t. \tag{10.7.2}$$
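
A small numerical sketch of (10.7.1) and (10.7.2) follows. It treats each oscillation as the coefficient pair of $\cos\omega t$ and $\sin\omega t$ and uses the fact that $a\cos\omega t + b\sin\omega t$ has amplitude $\sqrt{a^2 + b^2}$; the function names are hypothetical.

```python
import math

def add_oscillations(A, B, C, D):
    """Coefficients of x1 + x2 from (10.7.1): (A + C, B + D), as in (10.7.2)."""
    return A + C, B + D

def amplitude(a, b):
    """Amplitude of a*cos(w t) + b*sin(w t)."""
    return math.hypot(a, b)

def value(a, b, omega, t):
    """Value of a*cos(omega t) + b*sin(omega t) at time t."""
    return a * math.cos(omega * t) + b * math.sin(omega * t)

# Two identical oscillations: amplitude doubles, so the energy is four times larger.
a, b = add_oscillations(3.0, 4.0, 3.0, 4.0)
print(amplitude(a, b), (amplitude(a, b) / amplitude(3.0, 4.0)) ** 2)

# Opposite phases with equal amplitudes: the oscillations cancel out.
print(amplitude(*add_oscillations(3.0, 4.0, -3.0, -4.0)))
```

Adding the coefficient pairs is the same as adding the vectors with coordinates $(A, B)$ and $(C, D)$.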

The relationship between oscillation (10.7.2) and oscillations (10.7.1) is most conveniently described as follows: if to oscillations (10.7.1) we relate points $M$ and $N$ with coordinates $(A, B)$ and $(C, D)$ (or vectors $\overrightarrow{OM}$ and $\overrightarrow{ON}$), then oscillation (10.7.2) has corresponding to it the vertex $P$ of the parallelogram $OMPN$ (or the vector $\overrightarrow{OP} = \overrightarrow{OM} + \overrightarrow{ON}$; see Figure 10.7.1). The energy of oscillations (10.7.1) is then fixed by the quantities $(OM)^2 = A^2 + B^2$ and $(ON)^2 = C^2 + D^2$ (cf. (10.4.2) and (10.2.8a)), while the energy of the combined oscillation (10.7.2) is proportional to $(OP)^2$. If the initial phases in (10.7.1) are the same, the segments $OM$ and $ON$ point in the same direction and the length of the segment $OP$ is equal to the sum of the lengths of segments $OM$ and $ON$; the energy of oscillation (10.7.2) then proves to be much higher than the sum of the energies of oscillations (10.7.1), since the energy is specified by the square of the amplitude of oscillation ($OM$, $ON$, and $OP$); if the oscillations in (10.7.1) are the same, the energy of their sum will be four times the energy of each oscillation. If the initial phases in (10.7.1) are opposite (that is, differ by $\pi$), the segments $OM$ and $ON$ are in opposition; here the amplitude $OP$ of the sum oscillation is equal to the difference of amplitudes $OM$ and $ON$ of the component oscillations. The oscillations may even “cancel out” if their amplitudes (or energies) are the same. But if the energies differ only by a small quantity and the phases of $x_1$ and $x_2$ differ by $\pi$, the sum oscillation will have an extremely low energy (see Exercise 10.7.1).

Figure 10.7.1

The case of addition of oscillations with different frequencies, when the sum oscillation is not harmonic, is more complicated. Let us consider, by way of an example, a mechanism known as the crankgear, which transforms translational motion into rotation, and vice versa. Figure 10.7.2 shows such a mechanism. The crank $OA$ is connected with the slide block $B$ (a piston moving backward and forward inside a closed cylinder) by means of the connecting rod $AB$; the length of the crank will be denoted by $R$ and the length of the connecting rod by $L$. Let us assume that the crank is in uniform rotation about the axis $O$ with a frequency $\omega$, so that the projection $C$ of point $A$ on the $OB$ axis of the cylinder in which the slide block moves performs a harmonic oscillation about point $O$. The slide block $B$ will also perform oscillations about its midpoint, but these oscillations will not be harmonic.

Figure 10.7.2

Indeed, if $\angle AOB = \omega t$, then $OC = R\cos\omega t$, $AC = R\sin\omega t$, and, hence,

$$BC = \sqrt{L^2 - R^2\sin^2\omega t} = L\sqrt{1 - (R/L)^2\sin^2\omega t}.$$

Thus, we have

$$x = OB = R\cos\omega t + L\sqrt{1 - (R/L)^2\sin^2\omega t}. \tag{10.7.3}$$

But in the real mechanism $R$ is usually much less than $L$, so that the ratio $R/L$ is small (often of the order of 1/5, so that $(R/L)^2 \approx 1/25$); for a small $z$ we can replace $\sqrt{1 - z}$ with $1 - z/2$ (see Section 6.4). For this reason the comparatively complicated formula (10.7.3) can be simplified:

$$x \approx R\cos\omega t + L\left(1 - \frac{R^2}{2L^2}\sin^2\omega t\right) = R\cos\omega t + L\left[1 - \frac{R^2(1 - \cos 2\omega t)}{4L^2}\right] = \left(L - \frac{R^2}{4L}\right) + R\cos\omega t + \frac{R^2}{4L}\cos 2\omega t. \tag{10.7.4}$$

The first term $L - R^2/4L$ specifies the midpoint in the position of slide block $B$ (the slide block oscillates about this midpoint), participating in two harmonic oscillations simultaneously, that is, in the oscillation with frequency $\omega$ and amplitude $R$ fixed by the motion of point $C$, and in the oscillation with double frequency $2\omega$ and amplitude $R^2/4L$ fixed by the change in altitude $CA$ of point $A$ (Figure 10.7.3).
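
How good the two-harmonic approximation (10.7.4) is can be estimated numerically. The sketch below compares it with the exact expression (10.7.3) over one revolution of the crank; the function names and the sample values $R = 1$, $L = 5$ are illustrative assumptions.

```python
import math

def slider_exact(R, L, omega, t):
    """Exact slide-block position, formula (10.7.3)."""
    s = math.sin(omega * t)
    return R * math.cos(omega * t) + L * math.sqrt(1 - (R / L) ** 2 * s * s)

def slider_two_harmonics(R, L, omega, t):
    """Approximation (10.7.4): midpoint + fundamental + second harmonic."""
    return (
        (L - R * R / (4 * L))
        + R * math.cos(omega * t)
        + R * R / (4 * L) * math.cos(2 * omega * t)
    )

def max_error(R, L, omega=1.0, samples=1000):
    """Largest |exact - approximate| over one revolution of the crank."""
    period = 2 * math.pi / omega
    return max(
        abs(slider_exact(R, L, omega, period * i / samples)
            - slider_two_harmonics(R, L, omega, period * i / samples))
        for i in range(samples + 1)
    )

print(max_error(1.0, 5.0))
```

With $R/L = 1/5$ the discrepancy is of the order of $L\,(R/L)^4/8$, the first term dropped when $\sqrt{1 - z}$ is replaced with $1 - z/2$.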



Figure 10.7.3 

A peculiar situation arises when two oscillations with close frequencies $\omega_1$ and $\omega_2$ are combined. If, for instance, $x_1 = A\cos\omega_1 t$ and $x_2 = A\cos\omega_2 t$ (note that the amplitudes of both oscillations coincide), then

$$x = x_1 + x_2 = A\cos\omega_1 t + A\cos\omega_2 t = A(\cos\omega_1 t + \cos\omega_2 t) = 2A\cos\frac{\omega_1 - \omega_2}{2}t\,\cos\frac{\omega_1 + \omega_2}{2}t. \tag{10.7.5}$$

Here, obviously, the factor $2A\cos\frac{\omega_1 - \omega_2}{2}t$ changes very slowly, and the factor $\cos\frac{\omega_1 + \omega_2}{2}t$ corresponds to oscillations with a frequency $\frac{\omega_1 + \omega_2}{2} = \omega$. Oscillations specified by (10.7.5) are known as beats (see Figure 10.7.4a); often they are described (not quite correctly) as harmonic oscillations with frequency $\omega = \frac{\omega_1 + \omega_2}{2}$ and a slowly varying amplitude $A(t) = 2A\cos\frac{\omega_1 - \omega_2}{2}t$ $(= C\cos\Omega t)$. A similar situation arises when the phases of the oscillations $x_1$ and $x_2$ differ; only in this case the phase of the sum oscillation $x$ also experiences slow oscillations (see Exercise 10.7.2a).

Figure 10.7.4
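
The identity (10.7.5) and the role of the slowly varying factor can be checked with a few lines of code. This is a sketch only; the function names and the sample frequencies 10 and 11 are assumptions chosen for illustration.

```python
import math

def beat_sum(A, w1, w2, t):
    """Left-hand side of (10.7.5): A*cos(w1 t) + A*cos(w2 t)."""
    return A * math.cos(w1 * t) + A * math.cos(w2 * t)

def beat_product(A, w1, w2, t):
    """Right-hand side of (10.7.5): slow factor times fast factor."""
    return 2 * A * math.cos((w1 - w2) / 2 * t) * math.cos((w1 + w2) / 2 * t)

def envelope(A, w1, w2, t):
    """Absolute value of the slowly varying amplitude 2A*cos((w1 - w2) t / 2)."""
    return abs(2 * A * math.cos((w1 - w2) / 2 * t))

for i in range(5):
    t = 0.7 * i
    print(t, beat_sum(1.0, 10.0, 11.0, t), beat_product(1.0, 10.0, 11.0, t))
```

The sum never exceeds the envelope in absolute value, which is why the slow factor is read as a time-dependent amplitude.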

What happens if the frequencies $\omega$ and $\Omega$ of the component oscillations differ considerably? Suppose $x_1 = A\cos\omega t$ and $x_2 = B\cos\Omega t$, with $\Omega \gg \omega$. Then the sum oscillation $x = x_1 + x_2$ can be represented in the form of a harmonic oscillation with amplitude $B$ and frequency $\Omega$ about the slowly varying position of equilibrium, $a = a(t) = A\cos\omega t$ (see Figure 10.7.4b, where $A \gg B$).

We encounter the phenomenon of 
beats when we study ocean tides: the 
water level in oceans changes due to 
the attraction from the moon and the 
sun; in view of the earth’s rotation about 
its axis, the level oscillates with a 
period of 12 hours. But since the moon 
orbits the earth, the period of the level 
oscillations caused by the moon’s at- 
traction differs somewhat from the pe- 
riod of the level oscillations caused by 
the sun’s attraction: while the period 
of the “sun” tides is exactly 12 hours, 
the period of the “moon” tides (which are stronger) is close to 12½ h. Since these oscillations have different amplitudes, the amplitude of the sum oscillation is never zero, that is, the sum
oscillation does not coincide exactly 
with the beats depicted in Figure 10.7.4a; 
however, here also we can speak of a 
slow variation of the oscillation ampli- 
tude of the water level, the greatest 
height of the tide exceeding the lowest 
height by a factor of almost three (see 
Exercise 10.7.2b). Oscillations with a 
varying amplitude are also used in ra- 
diobroadcasting; for instance, the elec- 
tromagnetic waves in the SW band 
with a wavelength of about 20 meters 


have a frequency of $1.5 \times 10^7$ Hz
(or 15 MHz), while the amplitude of 
these oscillations varies within the au- 
dio-frequency range, that is, the range 
of frequencies to which the human ear 
is sensitive, approximately 15 to 
20 000 Hz. 

Exercises 

10.7.1. What are the (a) energy and (b) amplitude of the oscillations obtained through combining the oscillations $x_1$ and $x_2$ with the same frequency: $x_1 = A_1\cos(\omega t + \varphi_1)$ and $x_2 = A_2\cos(\omega t + \varphi_2)$?

10.7.2. (a) What type of oscillations is the sum of two harmonic oscillations, $x_1 = a\cos(\omega_1 t + \varphi_1)$ and $x_2 = a\cos(\omega_2 t + \varphi_2)$, if $\omega_1$ and $\omega_2$ are close? (b) Answer the same question for oscillations $x_1$ and $x_2$ with different amplitudes, $a_1$ and $a_2$.


10.8 The Vibrations of a String 

In Section 10.7 we studied the function 
x = f (t) that was the sum of two har- 
monic oscillations (with the same or dis- 
tinct frequencies). Next we will inve- 
stigate the extremely important ques- 
tion of representing any function (contin- 
uous or even discontinuous) in the form 
of a sum of harmonic oscillations. 
Physicists and mathematicians arrived 
at this question in their studies of a 
wide range of physical phenomena. One 
of these deals with the well-known 
problem of a vibrating string . 

Let us consider in the $xy$-plane a string fixed at points $x = 0$ and $x = l$; the string is understood to be a thin thread that can bend freely and along which there is a constant tensile force acting. Let us disturb the equilibrium (a state in which the string lies along the $x$ axis), that is, we shape it in a certain way specified by a function $y = f(x)$ (Figure 10.8.1). We then let the string go. The question is: how do the tensile forces change the shape of the string? We will restrict our discussion to small deformations, or strains, of the string, that is, we assume that $y$ is small (the deflection of the string from the $x$ axis is small) and $\partial y/\partial x$ is small (the deformed string still closely resembles the undistorted string, so that the angle $\alpha = \arctan(\partial y/\partial x)$ between a tangent to the string and the $x$ axis is small). We write $\partial y/\partial x$ instead of $dy/dx$ because the deflection $y$ of the string changes in time too, or $y = y(x, t)$ is a function of two variables, the abscissa $x$ and the time $t$. We will now derive the differential equation that governs the behavior of $y(x, t)$.

Figure 10.8.1

Let us take a small segment of the string, $MN$, restricted by points $M(x, y)$ and $N(x + dx, y + dy)$. The length of this segment is obviously

$$ds = \sqrt{dx^2 + dy^2} = dx\sqrt{1 + \left(\frac{\partial y}{\partial x}\right)^2} \approx dx$$

(cf. Sections 7.9 and 6.4), since we assume the derivative $\partial y/\partial x$ to be small and neglect terms of the order of $(\partial y/\partial x)^2$ and higher. Since we are considering string vibrations solely in the direction perpendicular to the $x$ axis, the velocity $v$ of point $M$ is $\partial y/\partial t$ and the acceleration $a$ coincides with the second (partial) derivative $\partial^2 y/\partial t^2$, while the mass of the segment $MN$ of the string is $\sigma\,ds \approx \sigma\,dx$, where $\sigma$ is the (constant) linear density of the string (which is assumed homogeneous). The equation of motion of the segment $MN$ of the string can be obtained if we equate the product $ma = \sigma\,dx\,(\partial^2 y/\partial t^2)$ to the force that acts on this segment.

The only type of forces that act on the 
string is a tensile force; in other words, 
here we are dealing only with free 
vibrations of the string and disregard 
all external forces that might act on it. 
On each end of the segment MN there 
acts a tensile force R that stretches the 
segment along the respective tangent. 
The components that are perpendicular 


to the $x$ axis (and hence cause the string to vibrate in the transverse direction) are, respectively, $R\sin\alpha$ and $R\sin(\alpha + d\alpha)$, where $\alpha$ and $\alpha + d\alpha$ are the angles between the tangents to the string at points $M(x, y)$ and $N(x + dx, y + dy)$ and the $x$ axis (see Figure 10.8.1). The resultant of these components is

$$R\sin(\alpha + d\alpha) - R\sin\alpha \approx R\,d(\sin\alpha)$$

(cf. Section 4.1). But we already know that $\tan\alpha = \partial y/\partial x$ and $\sin\alpha \approx \tan\alpha \approx \alpha$, so that we can write $\sin\alpha \approx \tan\alpha = \partial y/\partial x$ (here we again ignore quantities of the order of $\alpha^2$ (or $(\partial y/\partial x)^2$) or higher). For this reason, $d(\sin\alpha) \approx d(\partial y/\partial x) = (\partial^2 y/\partial x^2)\,dx$, since for the time being $t$ in our formulas is assumed to be fixed.

Thus, the force that causes the segment $MN$ of the string to vibrate is close to $R\,(\partial^2 y/\partial x^2)\,dx$, and the equation of string vibrations can be written thus:

$$\sigma\frac{\partial^2 y}{\partial t^2}\,dx = R\frac{\partial^2 y}{\partial x^2}\,dx,$$

or

$$\frac{\partial^2 y}{\partial t^2} = c^2\frac{\partial^2 y}{\partial x^2}, \tag{10.8.1}$$

where $c^2 = R/\sigma$ (clearly a positive quantity). The solution to this equation must satisfy two boundary conditions

$$y(0, t) = 0 \quad\text{and}\quad y(l, t) = 0 \tag{10.8.2a}$$

for all values of $t$ (the endpoints of the string are fixed) and two initial conditions

$$y(x, 0) = f(x) \quad\text{and}\quad \frac{\partial y}{\partial t}(x, 0) = 0 \tag{10.8.2b}$$

for all values of $x$ (the last initial condition means that at $t = 0$ the string is motionless: we pull the string away from the position of equilibrium, and this is specified by the function $f(x)$, and then let it go$^{10.12}$).


10.12 Instead of (10.8.2b) we could require that the velocity $v(x, 0) = (\partial y/\partial t)_{t=0}$ of the string for all $x$ at initial time $t = 0$ be a given function, that is, $v(x, 0) = g(x)$, where $g(x)$ is a known function (this constitutes the general problem for a fixed string; see Exercise 10.8.2).

The vibrating string equation was first formulated and solved by Jean Le Rond d’Alembert in 1747 (see Exercise 10.8.3). But of more importance was the somewhat later solution of this equation carried out by Daniel Bernoulli in 1753, who reasoned as follows. Let us seek the particular solution to Eq. (10.8.1) that satisfies the boundary conditions (10.8.2a) and the second initial condition $\partial y/\partial t = 0$ at $t = 0$ in the form of a product $y(x, t) = X(x)\,T(t)$ of a function $X(x)$ that depends only on $x$ and a function $T(t)$ that depends only on $t$. Substituting this product into Eq. (10.8.1) yields

$$XT'' = c^2X''T, \quad\text{or}\quad \frac{1}{c^2}\frac{T''}{T} = \frac{X''}{X}, \tag{10.8.3}$$

where, obviously, $X'' = d^2X/dx^2$ and $T'' = d^2T/dt^2$, since both $X$ and $T$ are functions of a single variable, $x$ and $t$, respectively. But since $c^{-2}T''/T$ depends only on $t$ and $X''/X$ only on $x$, the fact that these two ratios are equal means that they are independent of $x$ and $t$, that is,

$$\frac{X''}{X} = \frac{1}{c^2}\frac{T''}{T} = \text{constant}. \tag{10.8.4}$$

Clearly, if the ratios in (10.8.4) are equal to zero, then $X'' = 0$ and $X = ax + b$, which is a linear function in variable $x$. But since, in view of (10.8.2a), $X(0) = X(l) = 0$ (because $y = X(x)\,T(t)$, where, of course, $T(t) \not\equiv 0$, since otherwise the particular solution is of no interest at all), the fact that $X(x) = ax + b$ means that $b = 0$ (since $X(0) = 0$) and $al = 0$, or $a = 0$ (since $X(l) = 0$). Thus, we have arrived at a zero solution to Eq. (10.8.1), which is of no interest to anyone.

If the ratios in (10.8.4) are positive, say, equal to $k^2$, then

$$\frac{X''}{X} = k^2, \quad\text{that is,}\quad X'' = k^2X. \tag{10.8.5}$$

This equation is in many ways similar to the equation of harmonic oscillations (10.2.1). Just as with Eq. (10.2.1), its particular solutions can easily be guessed: $X_1 = e^{kx}$ and $X_2 = e^{-kx}$. These solutions yield $X_1' = ke^{kx}$, $X_1'' = k^2e^{kx} = k^2X_1$, $X_2' = -ke^{-kx}$, and $X_2'' = (-k)^2e^{-kx} = k^2X_2$ (cf. Section 13.10). An arbitrary linear combination of these solutions,

$$X = aX_1 + bX_2 = ae^{kx} + be^{-kx},$$

is also a (general) solution to Eq. (10.8.5) that can be made to satisfy any conditions of the form $X(0) = A$, $X'(0) = B$. (Why?) But the requirement that conditions (10.8.2a) be satisfied leads to the following:

$$X(0) = a + b = 0, \qquad X(l) = ae^{kl} + be^{-kl} = 0,$$

that is, $b = -a$ and $a\,(e^{kl} - e^{-kl}) = 0$, which are satisfied only if $a = b = 0$. Thus, even the assumption that the ratios in (10.8.4) are positive does not lead to a solution to Eq. (10.8.1) that is not identically zero: $X(x) \equiv 0$, and this means that $y(x, t) \equiv 0$.

Finally, suppose the ratios we are considering here are negative, that is, $X''/X = c^{-2}T''/T = -k^2 < 0$, or

$$\frac{X''}{X} = -k^2, \qquad X'' = -k^2X, \tag{10.8.6a}$$

$$\frac{1}{c^2}\frac{T''}{T} = -k^2, \qquad T'' = -c^2k^2T. \tag{10.8.6b}$$

In contrast to the two previous cases, here we have meaningful solutions to (10.8.1).

Clearly, both (10.8.6a) and (10.8.6b) are equations of the (10.2.1) type of harmonic oscillations for the functions $X = X(x)$ and $T = T(t)$ (the equations differ from (10.2.1) only in notation). From (10.8.6a) it follows that

$$X = A\cos kx + B\sin kx \tag{10.8.7}$$

(cf. Section 10.2). Since $X(0) = 0$ in view of the first boundary condition in (10.8.2a), the coefficient $A$ on the right-hand side of (10.8.7) is zero: $X = B\sin kx$. Substituting this $X$ into the second boundary condition $X(l) = 0$ yields $B\sin(kl) = 0$, which for $B \ne 0$ (the case $A = B = 0$ is, of course, of no interest to us) is possible only if

$$kl = n\pi, \quad\text{or}\quad k = \frac{\pi}{l}\,n, \tag{10.8.8}$$

with $n$ an integer. Thus, the final solution (10.8.7) to Eq. (10.8.6a) assumes the form

$$X = B\sin\left(\frac{n\pi}{l}x\right), \tag{10.8.9}$$

where $B$ is arbitrary, and $n$ is an arbitrary integer.

Similarly, the second equation of harmonic oscillations, (10.8.6b) (with $k$ defined in (10.8.8)), admits of the solution

$$T = C\cos(ckt) + D\sin(ckt) = C\cos\left(\frac{cn\pi}{l}t\right) + D\sin\left(\frac{cn\pi}{l}t\right). \tag{10.8.10}$$

Since

$$T' = \frac{dT}{dt} = -\frac{cn\pi}{l}\,C\sin\left(\frac{cn\pi}{l}t\right) + \frac{cn\pi}{l}\,D\cos\left(\frac{cn\pi}{l}t\right),$$

the second initial condition in (10.8.2b), $(\partial y/\partial t)_{t=0} = 0$, that is, $T'(0) = 0$, which we assumed to be satisfied, yields $D = 0$. Thus,

$$T = C\cos(cn\pi t/l)$$

and, hence,

$$y = XT = b_n\sin\left(\frac{n\pi}{l}x\right)\cos\left(\frac{cn\pi}{l}t\right), \tag{10.8.11}$$

where $b_n = BC$ is an arbitrary number, and $n$ is a natural number (which must be selected to fix a particular solution).$^{10.13}$

We have thus arrived at a whole family (10.8.11) of solutions to Eq. (10.8.1), since the natural number $n$ in (10.8.11) can assume any value. All these solutions are harmonic oscillations: each fixed point of the string (a fixed value $x = x_0$) oscillates according to the simple law

$$y(x_0, t) = d_n\cos(\omega_n t) \tag{10.8.12}$$

with an arbitrary amplitude $d_n = d_n(x_0) = b_n\sin(n\pi x_0/l)$ but with a fixed frequency $\omega_n = (c\pi/l)\,n$. This value $\omega_n$ of the frequency depends not only on the physical characteristics of the string (its length $l$, the tensile force $R$, and the linear density $\sigma$, through the ratio $R/\sigma = c^2$) but also on an arbitrary (but fixed for a particular solution) natural number $n$: the various allowed frequencies

$$\omega = \omega_1 = \frac{c\pi}{l}, \quad \omega_2 = \frac{2c\pi}{l} = 2\omega, \quad \omega_3 = 3\omega, \ \ldots \tag{10.8.13}$$

form an arithmetic progression; all frequencies are multiples of the minimum frequency $\omega$. The law of oscillations (10.8.11) states that all the segments of the string vibrate in unison: the frequencies $\omega_n$ do not depend on $x$, and the various segments differ only in the (smoothly varying) amplitudes $d_n = d_n(x) = b_n\sin(n\pi x/l)$.

10.13 The reader will recall that previously $n$ was assumed to be an integer and not only a natural number (i.e. a positive integer). It is clear, however, that a change of sign of $n$ in (10.8.11) leads only to a change of sign of $b_n$ and yields no new solutions to Eq. (10.8.1), while $n = 0$ corresponds to the trivial zero solution $y \equiv 0$ of the equation.
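
That each function (10.8.11) really satisfies Eq. (10.8.1) and the boundary conditions (10.8.2a) can be confirmed numerically. The sketch below replaces the second partial derivatives by central finite differences; the function names, the step size, and the sample values $n = 3$, $l = 1$, $c = 2$ are assumptions made for illustration.

```python
import math

def mode(x, t, n, l, c, b=1.0):
    """Particular solution (10.8.11): b*sin(n*pi*x/l)*cos(c*n*pi*t/l)."""
    return b * math.sin(n * math.pi * x / l) * math.cos(c * n * math.pi * t / l)

def angular_frequency(n, l, c):
    """Allowed frequency omega_n = (c*pi/l)*n, as in (10.8.13)."""
    return c * math.pi * n / l

def wave_residual(x, t, n, l, c, step=1e-4):
    """y_tt - c^2 * y_xx for the mode, using central second differences."""
    y = mode(x, t, n, l, c)
    y_tt = (mode(x, t + step, n, l, c) - 2 * y + mode(x, t - step, n, l, c)) / step**2
    y_xx = (mode(x + step, t, n, l, c) - 2 * y + mode(x - step, t, n, l, c)) / step**2
    return y_tt - c**2 * y_xx

print(wave_residual(0.3, 0.7, 3, 1.0, 2.0))      # close to zero
print(mode(0.0, 0.7, 3, 1.0, 2.0), mode(1.0, 0.7, 3, 1.0, 2.0))  # fixed ends
```

The residual is not exactly zero only because of the finite-difference approximation; it shrinks as the step is reduced, until rounding error starts to dominate.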

We have therefore found the solu- 
tion (10.8.11) (more precisely, a family 
of solutions (10.8.11)) to Eq. (10.8.1) 
satisfying the boundary conditions 
(10.8.2a) and the second initial condi- 
tion in (10.8.2b). But how about the 
first initial condition? It is clear that, 
generally speaking, we have no control 
over this condition: although we have 
the arbitrary constant b n which can be 
chosen at will, it is obvious that the 
initial shape of the string, $y(x, 0) = b_n\sin(n\pi x/l)$, is a sinusoid rather than an arbitrary curve (with a half-period $l/n$ that fits into $l$ a whole number of times; see Figure 10.8.2, where $n = 3$).

So how should we proceed if the initial shape of the string, $y(x, 0) = f(x)$, is not a sinusoid? The brilliant idea of D. Bernoulli (which was developed in the 19th century by J.B.J. Fourier into a powerful method for solving various problems in mathematical physics; see Section 10.9) consisted in reducing an arbitrary vibration (a seemingly complicated function of $x$ and $t$) of a string to a system of simple harmonic oscillations (or “pendulums”) (10.8.11); namely, D. Bernoulli suggested employing the superposition principle, that is, the fact that any linear combination of solutions to Eq. (10.8.1) is also a solution.

Let us set up the sum

$$y = b_1\sin\left(\frac{\pi x}{l}\right)\cos\left(\frac{c\pi t}{l}\right) + b_2\sin\left(\frac{2\pi x}{l}\right)\cos\left(\frac{2c\pi t}{l}\right) + \ldots \tag{10.8.14}$$

of the particular solutions (10.8.11) to Eq. (10.8.1). Clearly, this sum also satisfies Eq. (10.8.1), both boundary conditions in (10.8.2a) and the second initial condition in (10.8.2b), since each term in this sum satisfies the equation and the specified initial and boundary conditions.

Substituting $t = 0$ into (10.8.14) and allowing for the first initial condition in (10.8.2b), we get

$$y(x, 0) = f(x) = b_1\sin\frac{\pi x}{l} + b_2\sin\frac{2\pi x}{l} + b_3\sin\frac{3\pi x}{l} + \ldots, \tag{10.8.15}$$

thus reducing the initial problem to determining from (10.8.15) the coefficients $b_1$, $b_2$, $b_3$, ... in the representation (10.8.14) of a solution to Eq. (10.8.1). If this were to be done, it would be a remarkable achievement, since it would mean that a “continuous” vibration $y = y(x, t)$ of a string can be constructed from a discrete set (10.8.11) of separate harmonic oscillations (“pendulums”) and that the general equation (10.8.1) can be reduced, via a method of separation of variables $x$ and $t$, to ordinary (i.e. not containing partial derivatives) and simple (i.e. represented by elementary functions) equations (10.8.6a) and (10.8.6b). Of course, the representation (10.8.14) of the solution to Eq. (10.8.1) is of value only if it reduces to a finite (and preferably few) number of terms on the right-hand side of (10.8.14) and yields a complete description of the sought function $y = y(x, t)$.

Bernoulli put a lot of effort into the study of the theory of vibrations and was well-acquainted with the process of expansion of an arbitrary vibration into separate harmonics (i.e. simple harmonic oscillations): $f(t) = \sum_k c_k\cos(\omega_k t + \varphi_k)$. Using his experience as a base, he assumed that any function $f(x)$ can be represented in the form of a sum (10.8.15) of harmonics, that is, that formula (10.8.14) yields the general solution to the equation of string vibrations (10.8.1); the coefficients $b_1$, $b_2$, $b_3$, ... in expansion (10.8.14) are found from condition (10.8.15). This statement was severely criticized by L. Euler (see p. 0…). Euler argued that in the case of a simple initial function $f(x)$, say, the one depicted in Figure 10.8.3, where we pull the string away from the $x$ axis at the string’s middle, $x = l/2$, to a fixed (but small) height $h$ and then let it go, the function $f(x)$ is given by different formulas in the two halves of the string:

$$f(x) = \begin{cases} 2hx/l & \text{for } 0 \le x \le l/2, \\ 2h(l - x)/l & \text{for } l/2 \le x \le l, \end{cases} \tag{10.8.16}$$

while formula (10.8.15) gives a single expression for $f(x)$ on the entire range from 0 to $l$. D’Alembert, who agreed with Euler, also thought it impossible to represent an arbitrary function by a single formula of the (10.8.15) type (the example (10.8.16) of a case that seems to contradict Bernoulli’s statement was first suggested by d’Alembert). But Bernoulli did not yield a bit: his experience in mechanical phenomena told him that any function $y(x, 0)$ could be represented in the form (10.8.15). This debate played an extremely important role in clarifying the very notion of a function and in representing functions by trigonometric series, that is, sums of harmonics (or harmonic oscillations) given by formula (10.8.15) (see Section 10.9).

Figure 10.8.3
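
Bernoulli's claim can be probed numerically for the triangular shape that Euler and d'Alembert objected to (a string pulled to height $h$ at its middle). The sketch below assumes the standard sine-series coefficient formula $b_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx$, which the text has not derived at this point; the function names, sample counts and the values $l = 1$, $h = 0.1$ are likewise illustrative.

```python
import math

def plucked_shape(x, l, h):
    """Triangular initial shape: string pulled to height h at x = l/2."""
    return 2 * h * x / l if x <= l / 2 else 2 * h * (l - x) / l

def sine_coefficient(f, n, l, samples=2000):
    """b_n = (2/l) * integral over [0, l] of f(x)*sin(n*pi*x/l), midpoint rule.

    The coefficient formula itself is an assumption here (standard sine series).
    """
    dx = l / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        total += f(x) * math.sin(n * math.pi * x / l)
    return 2.0 / l * total * dx

def partial_sum(coeffs, x, l):
    """Finite part of (10.8.15): sum of b_n*sin(n*pi*x/l), n = 1..len(coeffs)."""
    return sum(b * math.sin((k + 1) * math.pi * x / l) for k, b in enumerate(coeffs))

l, h = 1.0, 0.1
coeffs = [sine_coefficient(lambda x: plucked_shape(x, l, h), n, l) for n in range(1, 51)]
for x in (0.1, 0.25, 0.5, 0.8):
    print(x, plucked_shape(x, l, h), partial_sum(coeffs, x, l))
```

With fifty terms the single trigonometric expression already follows both straight halves of the triangle closely, including the corner at $x = l/2$.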

Exercises 


10.8.1. Suppose $l = \pi$, $c = 1$, and $f(x) =$
sin 3 x. Find the solution (10.8.14) to Eq. (10.8.1) 
that satisfies the conditions (10.8.2a) and 
(10.8.2b). 

10.8.2. How will D. Bernoulli’s solution of Eq. (10.8.1) change if the initial conditions of Eq. (10.8.1) have the form $y(x, 0) = f(x)$ and $(\partial y(x, t)/\partial t)_{t=0} = g(x)$ (the initial shape of the string, $y(x, 0) = f(x)$, and the initial velocity, $v(x, 0) = (\partial y(x, t)/\partial t)_{t=0} = g(x)$, are specified, and then the string is released)?

10.8.3. (d’Alembert’s solution of Eq. (10.8.1)). Prove that every function $y(x, t) = \varphi(x + ct) + \psi(x - ct)$, where $\varphi$ and $\psi$ are arbitrary functions of one variable, satisfies Eq. (10.8.1). How must $\varphi$ and $\psi$ be chosen so that the solution $y(x, t)$ satisfies the boundary conditions (10.8.2a) and the initial conditions (10.8.2b)? What happens if the boundary conditions are those specified in (10.8.2a) but the initial conditions are $y(x, 0) = f(x)$ and $v(x, 0) = g(x)$ (cf. Exercise 10.8.2)?


10.9 Harmonic Analysis. Fourier Series 

Suppose that we have an arbitrary function defined on a finite interval; it will be convenient to assume that this function, $y = f(x)$, is defined on the interval $-\pi \le x \le \pi$ (see, however, p. 383 and Exercise 10.9.2). The problem consists in finding the coefficients in the expansion of $f(x)$ in a trigonometric series, that is, representing $f(x)$ in the form of a sum of harmonics 1, $\cos x$, $\sin x$, $\cos 2x$, $\sin 2x$, $\cos 3x$, $\sin 3x$, etc., with appropriate numerical coefficients. To this end we write

$$f(x) = \frac{a_0}{2} + a_1\cos x + b_1\sin x + a_2\cos 2x + b_2\sin 2x + a_3\cos 3x + b_3\sin 3x + \ldots, \tag{10.9.1}$$

where the numbers $a_0/2$, $a_1$, $b_1$, $a_2$, $b_2$, ... have yet to be found (the term in (10.9.1) that is independent of $x$ is written in the form $a_0/2$, instead of $a_0$, only for the sake of convenience).

First let us note the property of orthogonality of trigonometric functions:

$$\int_{-\pi}^{\pi} \cos mx \cos nx \, dx = \int_{-\pi}^{\pi} \sin mx \sin nx \, dx = 0$$

for all natural numbers $m$ and $n$, with $m \ne n$;

$$\int_{-\pi}^{\pi} \cos mx \sin nx \, dx = 0 \tag{10.9.2}$$

for all natural numbers $m$ and $n$; and

$$\int_{-\pi}^{\pi} \cos^2 mx \, dx = \int_{-\pi}^{\pi} \sin^2 mx \, dx = \pi$$

for all natural numbers $m$. These formulas follow from the well-known fact
that

$$\begin{aligned}
\cos mx \cos nx &= \tfrac{1}{2}\,[\cos (m - n)x + \cos (m + n)x], \\
\sin mx \sin nx &= \tfrac{1}{2}\,[\cos (m - n)x - \cos (m + n)x], \\
\cos mx \sin nx &= \tfrac{1}{2}\,[\sin (m + n)x - \sin (m - n)x];
\end{aligned} \tag{10.9.3}$$

particular cases of the first two formulas in (10.9.3) corresponding to
$m = n$ are the well-known relationships

$$\cos^2 mx = \tfrac{1}{2}\,(1 + \cos 2mx), \qquad \sin^2 mx = \tfrac{1}{2}\,(1 - \cos 2mx) \tag{10.9.3a}$$

(since $\cos 0 = 1$). To verify (10.9.2), we need only integrate from $-\pi$
to $\pi$ the right-hand sides of all formulas (10.9.3) and (10.9.3a)
employing the rules discussed in Chapter 5.
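
The orthogonality relations (10.9.2) can also be checked numerically. The
following Python sketch is not part of the original text: it assumes a plain
midpoint rule for the integrals, and the helper names (`integrate`,
`cos_cos`, `sin_sin`, `cos_sin`) are our own.

```python
import math


def integrate(func, a, b, steps=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def cos_cos(m, n):
    """Integral of cos(mx) cos(nx) over [-pi, pi]."""
    return integrate(lambda x: math.cos(m * x) * math.cos(n * x), -math.pi, math.pi)


def sin_sin(m, n):
    """Integral of sin(mx) sin(nx) over [-pi, pi]."""
    return integrate(lambda x: math.sin(m * x) * math.sin(n * x), -math.pi, math.pi)


def cos_sin(m, n):
    """Integral of cos(mx) sin(nx) over [-pi, pi]."""
    return integrate(lambda x: math.cos(m * x) * math.sin(n * x), -math.pi, math.pi)


if __name__ == "__main__":
    # Different indices: all three integrals should be close to zero.
    print(cos_cos(2, 3), sin_sin(2, 3), cos_sin(2, 3))
    # Equal indices: the squares integrate to pi, the mixed product to zero.
    print(cos_cos(3, 3), sin_sin(3, 3), cos_sin(3, 3), math.pi)
```

For products of sines and cosines integrated over a full period the midpoint
rule is very accurate, so the printed values should agree with (10.9.2) up to
rounding error.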

Suppose that we already have the representation of $y = f(x)$ in the form
(10.9.1). The function $f(x)$ is assumed known; we wish to find the
coefficients $a_0$, $a_1$, $b_1$, $a_2$, $b_2$, etc. in the trigonometric
series. To find $a_m$, we multiply both sides of (10.9.1) by $\cos mx$ and
integrate the products from $-\pi$ to $\pi$. On the left we have
$\int_{-\pi}^{\pi} f(x) \cos mx \, dx$; on the right all terms will be equal
to zero by virtue of (10.9.2) except

$$\int_{-\pi}^{\pi} a_m \cos^2 mx \, dx = a_m \int_{-\pi}^{\pi} \cos^2 mx \, dx = a_m \pi, \tag{10.9.4}$$

since all the other terms in the right-hand side of (10.9.1) after
integration transform into

$$\begin{aligned}
\int_{-\pi}^{\pi} a_n \cos nx \cos mx \, dx &= a_n \int_{-\pi}^{\pi} \cos nx \cos mx \, dx = 0 \quad (n \ne m), \\
\int_{-\pi}^{\pi} b_n \sin nx \cos mx \, dx &= b_n \int_{-\pi}^{\pi} \sin nx \cos mx \, dx = 0
\end{aligned} \tag{10.9.5}$$

(at $n = 0$ the first formula in (10.9.5) becomes
$\int_{-\pi}^{\pi} \frac{a_0}{2} \cos mx \, dx = \frac{a_0}{2} \int_{-\pi}^{\pi} \cos mx \, dx = 0$).
The final result is

$$\int_{-\pi}^{\pi} f(x) \cos mx \, dx = a_m \pi,$$

or

$$a_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx \, dx, \qquad m = 1, 2, 3, \dots \tag{10.9.6}$$

Similarly, multiplying both sides of (10.9.1) by $\sin mx$ and integrating
from $-\pi$ to $\pi$, we get

$$\int_{-\pi}^{\pi} f(x) \sin mx \, dx = b_m \pi,$$

or

$$b_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin mx \, dx, \qquad m = 1, 2, 3, \dots \tag{10.9.6a}$$

Finally, simply integrating both sides of (10.9.1) from $-\pi$ to $\pi$ and
employing the fact that

$$\int_{-\pi}^{\pi} \cos nx \, dx = \int_{-\pi}^{\pi} \sin nx \, dx = 0 \quad \text{and} \quad \int_{-\pi}^{\pi} dx = 2\pi$$

yields

$$\int_{-\pi}^{\pi} f(x) \, dx = \int_{-\pi}^{\pi} \frac{a_0}{2} \, dx = \frac{a_0}{2} \times 2\pi,$$

or

$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \, dx, \tag{10.9.7}$$

which is a particular case of formula (10.9.6) with $m = 0$.$^{10.14}$

10.14 It is precisely for this reason that we wrote $a_0/2$ in the
right-hand side of (10.9.1) instead of $a_0$.

The formulas for the coefficients of a trigonometric series simplify when
the function in question is even or odd. For an even function,
$f(-x) = f(x)$,

$$\begin{aligned}
\int_{-\pi}^{\pi} f(x) \sin mx \, dx
&= \int_{-\pi}^{0} f(x) \sin mx \, dx + \int_{0}^{\pi} f(x) \sin mx \, dx \\
&= \int_{\pi}^{0} f(-x) \sin (-mx) \, d(-x) + \int_{0}^{\pi} f(x) \sin mx \, dx \\
&= \int_{\pi}^{0} f(x) \sin mx \, dx + \int_{0}^{\pi} f(x) \sin mx \, dx = 0,
\end{aligned}$$

since $f(-x) = f(x)$, $\sin (-mx) = -\sin mx$, and $d(-x) = -dx$.
Therefore, for an even function we have

$$f(x) = \frac{a_0}{2} + a_1 \cos x + a_2 \cos 2x + a_3 \cos 3x + \dots, \tag{10.9.8}$$

where

$$\begin{aligned}
a_0 &= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \, dx = \frac{2}{\pi} \int_{0}^{\pi} f(x) \, dx, \\
a_m &= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx \, dx
= \frac{1}{\pi} \left( \int_{-\pi}^{0} f(x) \cos mx \, dx + \int_{0}^{\pi} f(x) \cos mx \, dx \right) \\
&= \frac{2}{\pi} \int_{0}^{\pi} f(x) \cos mx \, dx.
\end{aligned} \tag{10.9.9}$$

In a similar manner, if we are dealing with an odd function,
$f(-x) = -f(x)$, then

$$a_m = 0 \quad \text{and} \quad b_m = \frac{2}{\pi} \int_{0}^{\pi} f(x) \sin mx \, dx. \tag{10.9.10}$$

(Prove this!) The result is

$$f(x) = b_1 \sin x + b_2 \sin 2x + b_3 \sin 3x + \dots \tag{10.9.11}$$

Thus, even functions can be expanded in a trigonometric series (10.9.1)
containing only cosines, while odd functions can be expanded in a
trigonometric series containing only sines.

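Example 1. Suppose that

$$f(x) = \begin{cases} \pi + x & \text{for } -\pi \le x \le 0, \\ \pi - x & \text{for } 0 \le x \le \pi \end{cases}$$

(Figure 10.9.1a; cf. Figure 10.8.3). This function is even, and therefore
can be expanded in a trigonometric series containing only cosines, with
coefficients

$$a_0 = \frac{2}{\pi} \int_{0}^{\pi} f(x) \, dx = \frac{2}{\pi} \int_{0}^{\pi} (\pi - x) \, dx
= \frac{2}{\pi} \left( \pi x - \frac{x^2}{2} \right) \Big|_{0}^{\pi}
= \frac{2}{\pi} \left( \pi^2 - \frac{\pi^2}{2} \right) = \pi,$$

$$\begin{aligned}
a_m &= \frac{2}{\pi} \int_{0}^{\pi} (\pi - x) \cos mx \, dx
= \frac{2}{\pi} \left( \pi \int_{0}^{\pi} \cos mx \, dx - \int_{0}^{\pi} x \cos mx \, dx \right) \\
&= \frac{2}{\pi} \left[ \pi \int_{0}^{\pi} \cos mx \, dx - \frac{1}{m} (x \sin mx) \Big|_{0}^{\pi} + \frac{1}{m} \int_{0}^{\pi} \sin mx \, dx \right] \\
&= \frac{2}{\pi} \left( \frac{\pi}{m} \sin mx - \frac{x \sin mx}{m} - \frac{1}{m^2} \cos mx \right) \Big|_{0}^{\pi}
= \begin{cases} 0 & \text{for } m \text{ even}, \\ \dfrac{4}{\pi m^2} & \text{for } m \text{ odd}. \end{cases}
\end{aligned}$$

The final result is

$$f(x) = \frac{\pi}{2} + \frac{4}{\pi} \cos x + \frac{4}{9\pi} \cos 3x + \frac{4}{25\pi} \cos 5x + \dots \tag{10.9.12}$$

Figure 10.9.1a presents three functions that successively approximate
$f(x)$, which is depicted by a heavy solid line: the function
$\varphi_0(x) = \pi/2$ (the horizontal dotted line),
$\varphi_1(x) = \pi/2 + (4/\pi) \cos x$ (the dashed curve), and
$\varphi_2(x) = \pi/2 + (4/\pi) \cos x + (4/9\pi) \cos 3x$ (the thin solid
curve).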
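
The coefficients of Example 1 can be cross-checked by evaluating formula
(10.9.6) numerically. The sketch below is an illustration added here, not
the authors' procedure; it assumes a midpoint rule, and the names
`a_coeff`, `a_exact`, and `partial_sum` are hypothetical.

```python
import math


def integrate(func, a, b, steps=4000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def f(x):
    """Function of Example 1 on [-pi, pi]: pi + x for x <= 0, pi - x for x >= 0."""
    return math.pi - abs(x)


def a_coeff(m):
    """a_m computed numerically from (10.9.6); m = 0 gives (10.9.7)."""
    return integrate(lambda x: f(x) * math.cos(m * x), -math.pi, math.pi) / math.pi


def a_exact(m):
    """Closed-form coefficients found in Example 1."""
    if m == 0:
        return math.pi
    return 4.0 / (math.pi * m * m) if m % 2 else 0.0


def partial_sum(x, n):
    """Sum of the series (10.9.12) up to and including the harmonic cos(nx)."""
    return a_exact(0) / 2 + sum(a_exact(m) * math.cos(m * x) for m in range(1, n + 1))


if __name__ == "__main__":
    for m in range(6):
        print(m, a_coeff(m), a_exact(m))
    # phi_0, phi_1 and phi_2 of Figure 10.9.1a, evaluated at the corner x = 0.
    print(partial_sum(0.0, 0), partial_sum(0.0, 1), partial_sum(0.0, 3), f(0.0))
```

The numerical and closed-form coefficients should agree to several decimal
places, and the partial sums at $x = 0$ approach $f(0) = \pi$ rather slowly,
since the corner of the graph is the point where the approximation is worst.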

Example 2. Suppose that

$$g(x) = \begin{cases} 1 & \text{for } 0 < x \le \pi, \\ -1 & \text{for } -\pi \le x < 0 \end{cases}$$

(Figure 10.9.1b). This is an odd function, and therefore it can be expanded
into a trigonometric series containing only sines, with coefficients

$$b_m = \frac{2}{\pi} \int_{0}^{\pi} \sin mx \, dx = \frac{2}{m\pi} (-\cos mx) \Big|_{0}^{\pi}
= \begin{cases} \dfrac{4}{m\pi} & \text{for } m \text{ odd}, \\ 0 & \text{for } m \text{ even}, \end{cases}$$

so that

$$g(x) = \frac{4}{\pi} \left( \sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \dots \right). \tag{10.9.13}$$

Figure 10.9.1b presents three functions that successively approximate
$g(x)$, which is depicted by two heavy solid lines:
$\psi_1(x) = (4/\pi) \sin x$ (the dotted curve),
$\psi_2(x) = (4/\pi)(\sin x + (\sin 3x)/3)$ (the dashed curve), and
$\psi_3(x) = (4/\pi)(\sin x + (\sin 3x)/3 + (\sin 5x)/5)$ (the thin solid
curve).
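
The partial sums of (10.9.13) are easy to tabulate. The following Python
sketch is an added illustration under our own naming (`b_exact`,
`partial_sum`); it uses the closed-form coefficients of Example 2 and
assigns $g(0) = 0$ purely for convenience.

```python
import math


def g(x):
    """Step function of Example 2 on [-pi, pi]; the value at 0 is our convention."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def b_exact(m):
    """Closed-form sine coefficients: 4/(m*pi) for odd m, 0 for even m."""
    return 4.0 / (math.pi * m) if m % 2 else 0.0


def partial_sum(x, n):
    """Sum of the series (10.9.13) up to and including the harmonic sin(nx)."""
    return sum(b_exact(m) * math.sin(m * x) for m in range(1, n + 1))


if __name__ == "__main__":
    # psi_1, psi_2, psi_3 of Figure 10.9.1b at the middle of the upper step.
    for n in (1, 3, 5, 99, 999):
        print(n, partial_sum(math.pi / 2, n))
    # Every partial sum vanishes at the jump x = 0.
    print(partial_sum(0.0, 999))
```

At $x = \pi/2$ the partial sums oscillate around 1 and settle down only
slowly, while at the jump $x = 0$ every partial sum is exactly zero, the
mean of the values $-1$ and $+1$ on either side.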

Up till now we spoke only of functions defined on the interval
$-\pi \le x \le \pi$. But the case of a function defined on an arbitrary
interval $-l \le x \le l$ introduces nothing new, since it is obvious that
we can always introduce a new variable in such a way that the new interval
will be reduced to the interval from $-\pi$ to $\pi$ (cf. Section 1.7).

Let us introduce a new variable, $x_1 = (\pi/l)x$, or $x = (l/\pi)x_1$.
Then $f(x)$ can be rewritten thus: $f(x_1 l/\pi) = f_1(x_1)$, where the
function $f_1$ of the new independent variable $x_1 = (\pi/l)x$ is defined
on the interval $-\pi \le x_1 \le \pi$. The expansion (10.9.1), in which $x$
and $f(x)$ are replaced with $x_1$ and $f_1(x_1)$, generates the following
expansion of the initial function $y = f(x)$:

$$\begin{aligned}
y = \frac{a_0}{2} &+ a_1 \cos \left( \frac{\pi}{l} x \right) + b_1 \sin \left( \frac{\pi}{l} x \right)
+ a_2 \cos \left( \frac{2\pi}{l} x \right) + b_2 \sin \left( \frac{2\pi}{l} x \right) \\
&+ a_3 \cos \left( \frac{3\pi}{l} x \right) + b_3 \sin \left( \frac{3\pi}{l} x \right) + \dots,
\end{aligned} \tag{10.9.14}$$

with

$$a_k = \frac{1}{l} \int_{-l}^{l} f(x) \cos \left( \frac{k\pi x}{l} \right) dx, \qquad
b_k = \frac{1}{l} \int_{-l}^{l} f(x) \sin \left( \frac{k\pi x}{l} \right) dx \tag{10.9.15}$$

(see Exercise 10.9.1).
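
Formulas (10.9.15) translate directly into a small routine. The sketch below
is an added illustration that assumes a midpoint rule; the function name
`coefficients` and the test function $f(x) = x$ on $-2 \le x \le 2$ are our
own choices, not taken from the text.

```python
import math


def integrate(func, a, b, steps=4000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def coefficients(func, half_length, k):
    """Return (a_k, b_k) from (10.9.15) for func defined on [-l, l]."""
    l = half_length
    a_k = integrate(lambda x: func(x) * math.cos(k * math.pi * x / l), -l, l) / l
    b_k = integrate(lambda x: func(x) * math.sin(k * math.pi * x / l), -l, l) / l
    return a_k, b_k


if __name__ == "__main__":
    # Hypothetical example: f(x) = x on [-2, 2]. Integrating by parts gives
    # a_k = 0 and b_k = 2*l*(-1)**(k + 1) / (k*pi), so b_1 = 4/pi for l = 2.
    for k in (1, 2, 3):
        print(k, coefficients(lambda x: x, 2.0, k))
```

Since $f(x) = x$ is odd, all the $a_k$ come out as zero up to rounding, in
line with the remark on odd functions made above.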

Note also that a function $y = f(x)$ defined, say, in the interval
$0 \le x \le l$ can always be continued onto negative values of $x$ by
setting $f(-x) = f(x)$, which results in the following representation for
$f(x)$:

$$f(x) = \frac{a_0}{2} + a_1 \cos \left( \frac{\pi}{l} x \right) + a_2 \cos \left( \frac{2\pi}{l} x \right)
+ a_3 \cos \left( \frac{3\pi}{l} x \right) + \dots, \tag{10.9.16}$$

where, obviously,

$$a_k = \frac{2}{l} \int_{0}^{l} f(x) \cos \left( \frac{k\pi x}{l} \right) dx, \qquad k = 0, 1, 2, \dots \tag{10.9.17}$$

(Why? See Exercise 10.9.1.) The function $f(x)$ can also be continued onto
negative values of $x$ by assuming that $f(-x) = -f(x)$. This leads to a
sine expansion of $f(x)$:

$$f(x) = b_1 \sin \left( \frac{\pi}{l} x \right) + b_2 \sin \left( \frac{2\pi}{l} x \right)
+ b_3 \sin \left( \frac{3\pi}{l} x \right) + \dots, \tag{10.9.18}$$

where

$$b_k = \frac{2}{l} \int_{0}^{l} f(x) \sin \left( \frac{k\pi x}{l} \right) dx, \qquad k = 1, 2, 3, \dots \tag{10.9.19}$$

We encountered formula (10.9.18) when we studied string vibrations in
Section 10.8 (see formula (10.8.15) for the function $y(x, 0) = f(x)$).

Note that the right-hand side of Eq. (10.9.1) can be understood as a
function defined not only on the interval from $-\pi$ to $\pi$ but on the
entire number axis, since all trigonometric functions on the right-hand
side of (10.9.1), $\cos x$, $\sin x$, $\cos 2x$, $\sin 2x$, $\cos 3x$,
$\sin 3x$, etc., are periodic with the least common period $2\pi$: if
$\varphi(x)$ is any one of the functions considered, then
$\varphi(x + 2k\pi) = \varphi(x)$ for every integer $k$. For this reason the
sum of the series on the right-hand side of (10.9.1) (we will, as usual,
write this sum as $f(x)$) is a periodic function with a period of $2\pi$,
or $f(x + 2k\pi) = f(x)$ for all integers $k$, since all terms on the
right-hand side of (10.9.1) retain their values if we replace $x$ with
$x + 2k\pi$. Similarly, the right-hand side of the more general formula
(10.9.14) defines a periodic function with a period of $2l$. It is even
sometimes said that the formulas (10.9.1) and (10.9.14) define the
expansion of periodic functions with periods $2\pi$ and $2l$, respectively.
For instance, in Figure 10.9.2 we see the (“complete”) functions $f(x)$ and
$g(x)$ of Examples 1 and 2.

Figure 10.9.2

Of course, all the representations of functions in the form of infinite
sums of trigonometric functions, that is, formulas (10.9.1), (10.9.8),
(10.9.11), (10.9.14), (10.9.16), and (10.9.18), or, to use a term employed
many times, in the form of trigonometric series, are valuable only if the
respective series converge, if one is to use the terminology of
Section 6.3. All this means that a sum of a finite number of terms
(preferably, a small number of terms) in, say, the series (10.9.1),

$$s_n = s_n(x) = \frac{a_0}{2} + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \dots + a_n \cos nx + b_n \sin nx, \tag{10.9.20}$$

provides a fairly good approximation to the right-hand side of (10.9.1) and
that, as $n \to \infty$, the sum $s_n$ tends to $f(x)$, so that the greater
the $n$, the more exact is the approximate equality

$$f(x) \approx s_n(x). \tag{10.9.21}$$

In the debate of D. Bernoulli with L. Euler and J. d’Alembert it was
Bernoulli who was right: it has been established that practically any
function (periodic or defined on a finite interval) can be expanded in a
trigonometric series.$^{10.15}$ This assertion means that, say, each
function $f(x)$ that is defined on the interval from $-\pi$ to $\pi$ (or,
which is the same, each periodic function $f(x)$ with a period of $2\pi$)
can be represented in the form (10.9.1), that is, the approximate equality
(10.9.21), with the right-hand side defined by (10.9.20), is valid. At
points where the function $f(x)$ is continuous, the sums $s_n$ are close to
$f(x)$ (obviously, the greater the number $n$, the closer $s_n$ is to
$f(x)$), while at points where $f(x)$ has a discontinuity, or where it
experiences a jump, so that its value on the left, $f(x - 0)$, differs from
the value on the right, $f(x + 0)$, the sum $s_n$ approaches, as
$n \to \infty$, the arithmetic mean $(1/2)[f(x - 0) + f(x + 0)]$ of these
values.$^{10.16}$ The rate of convergence of the sums $s_n(x)$ to the
function $f(x)$ depends strongly on the behavior of the function: it is the
highest in the case of smooth functions; the presence of discontinuities in
the derivative of the function (see Example 1) reduces the rate of
convergence, while if the function has discontinuities (as in Example 2),
the rate is reduced still further, which requires using a large number of
terms in the trigonometric series to represent the function with a given
accuracy.

10.15 For it to be impossible to expand a function $f(x)$ in a
trigonometric series, $f(x)$ must have on the interval on which it is
defined (or over one period if $f(x)$ is periodic) an infinite number of
discontinuities (or jumps) or an infinite number of maxima and minima.

10.16 For the function considered in Example 2 we must assume that
$g(0 - 0) = -1$ and $g(0 + 0) = 1$; the sums $s_n(x)$ at $x = 0$ converge,
as $n \to \infty$, to the value $(1/2)[(-1) + 1] = 0$.

Let us explain, without laying claims to any mathematical rigor whatsoever,
the reasons why the “degree of smoothness” of a function $f(x)$ affects the
rate of convergence of the trigonometric series representing the function.
It is clear that to be able to discard in the right-hand side of
representation (10.9.1) all terms except a known (small) number we must
ensure that the discarded terms rapidly decrease, so that even an infinite
number of such terms has little effect on the result. And since the sine
and cosine are bounded functions (neither can exceed unity in absolute
value), the coefficients defined by (10.9.6) and (10.9.6a) must decrease
rapidly as $m \to \infty$. But for smooth functions the decrease of these
coefficients follows readily from the oscillatory (with alternating signs)
nature of the functions $\cos mx$ and $\sin mx$ present in the integrands
in the formulas for $a_m$ and $b_m$.

Indeed, if $m$ is high, the period of, say, $\sin mx$ proves to be so small
that the (smooth) function $f(x)$ has no time to vary appreciably over the
period. This means that the fraction of $\int f(x) \sin mx \, dx$ (which
yields $b_m$) corresponding to one period of $\sin mx$ will be practically
zero, since $f(x)$ may be assumed constant and the values of the integrals
corresponding to the two half-periods of $\sin mx$ will almost completely
cancel out (since they have opposite signs and their absolute values are
practically the same).

A more exact calculation shows that, if $m$ is high, the sum of all the
fractions of the integrals in the right-hand sides of (10.9.6) and
(10.9.6a) corresponding to separate waves of the cosine $y = \cos mx$ and
the sine $y = \sin mx$ proves to be extremely small; this ensures the
smallness of the expansion coefficients in (10.9.1) and, therefore, the
convergence of the series in (10.9.1). If $f(x)$ is not smooth, that is,
$f(x)$ suddenly changes its direction at certain points (see
Figure 10.9.2a), the decrease in the expansion coefficients proves to be
not so rapid, compared to the case where the function is smooth everywhere.
The convergence is even worse if the function experiences a jump (see
Figure 10.9.2b), in the vicinity of which the function $f(x)$ in no way can
be considered a constant. Indeed, we see that the coefficients $b_m$
corresponding to the discontinuous function $g(x)$ defined by (10.9.13)
decrease only like $1/m$ as $m$ increases, while in the case of the
continuous function of Example 1 the coefficients $a_m$ in (10.9.12)
decrease, roughly speaking, like $1/m^2$, as $m \to \infty$; for a smooth
function the rate of decrease of the coefficients in the trigonometric
series corresponding to this function is even higher.
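
The three rates of decrease can be compared side by side. The Python sketch
below is an added illustration, not part of the original argument: it
assumes a midpoint rule for (10.9.6) and (10.9.6a), and the smooth test
function $e^{\cos x}$ is a hypothetical choice of ours (any infinitely
differentiable $2\pi$-periodic function would do).

```python
import math


def integrate(func, a, b, steps=8000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def triangle(x):
    """Continuous function with a corner (Example 1)."""
    return math.pi - abs(x)


def square(x):
    """Function with a jump at x = 0 (Example 2)."""
    return 1.0 if x > 0 else -1.0


def smooth(x):
    """Hypothetical smooth periodic function, chosen only for comparison."""
    return math.exp(math.cos(x))


def a_coeff(func, m):
    """Cosine coefficient a_m from (10.9.6)."""
    return integrate(lambda x: func(x) * math.cos(m * x), -math.pi, math.pi) / math.pi


def b_coeff(func, m):
    """Sine coefficient b_m from (10.9.6a)."""
    return integrate(lambda x: func(x) * math.sin(m * x), -math.pi, math.pi) / math.pi


def decay_table(ms=(1, 3, 5, 7, 9)):
    """Rows (m, |b_m| for the jump, |a_m| for the corner, |a_m| for the smooth case)."""
    return [
        (m, abs(b_coeff(square, m)), abs(a_coeff(triangle, m)), abs(a_coeff(smooth, m)))
        for m in ms
    ]


if __name__ == "__main__":
    for row in decay_table():
        print(row)
```

In each row the coefficient of the smooth function should be by far the
smallest, and the products $m\,b_m$ (jump) and $m^2 a_m$ (corner) should
stay close to the constant $4/\pi$ for odd $m$.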

Let us take up once more the question 
of the famous debate between 
d’Alembert and Euler, on the one hand, 
and D. Bernoulli, on the other, concer- 
ning the vibrations of a string whose two 
ends are fixed. Particular solutions 
(10.8.11) of this problem were obtained 
in 1713-1715 by Brook Taylor (already 
mentioned earlier in this book), who 
assumed that no other solutions were pos- 
sible (this assumption, however, lacked 
foundation). D. Bernoulli’s achieve- 
ment lay in the fact that he assumed 
that separate vibrations, (10.8.11), can 
mix, or that a string can generate tones 
of various frequencies simultaneously. 
The fundamental tone corresponds to the first frequency $\omega$, while
the other frequencies, $2\omega$, $3\omega$, etc. (see (10.8.13)),
correspond to overtones, or higher harmonics.$^{10.17}$ It was the
richness of the
physical content of the string vibration 
problem provided by the representation 
(10.8.14) of the function y (x, t) and 
the complete agreement of the results 
with the appropriate experimental data 
that made Bernoulli so sure of his re- 
sults and enabled him to withstand the 
criticism of his friend Euler (whom he 
deeply respected) and of d’Alembert. 

Further developments of Bernoulli’s 
method are connected primarily with 
the name of J.B.J. Fourier, who was the 
first to give a coherent general theory 
for expanding functions in trigonomet- 
ric series, a theory based on the simple 
formulas for the expansion coefficient 
obtained above. He also provided many 
examples of expansions of concrete 
functions. But even more important 
were his applications of such expansions 
to specific problems of mathematical 
physics, say, the problem of heat pro- 
pagation. These applications were based 
on the same idea as the one put forward by Bernoulli: the initial
conditions in the problem, conditions specified by an arbitrary function
$f(x)$, were first assumed by Fourier to be sinusoidal, that is, he dealt
only with functions of the $\sin mx$ or $\cos mx$ type. With such simple
initial conditions solving the problem was relatively easy, while the
general solution was then found via the superposition principle and the
possibility of expanding $f(x)$ in a trigonometric series. And it appears
quite natural that although many scientists before Fourier (Euler,
D. Bernoulli, Gauss, and others) gave examples of expanding functions into
trigonometric series, all such series are today called Fourier series.

10.17 Already in the fifth century B.C. the Pythagoreans of ancient Greece
studied the relationships between the fundamental tone and the overtones of
a vibrating string, which relationships reduce to simple ratios of the
lengths of the periods of the corresponding sinusoidal waves, and the
relation of these integral ratios to the euphony of the sounds produced by
the string. The results obtained by the Pythagoreans played an extremely
important role in the formation of their belief in the simplicity and
cognoscibility of the world. They also considered mathematics the key to
studying nature.

The procedure by which a function $f(x)$ is expanded in a Fourier series
(or in a trigonometric series) is known as Fourier analysis. For example,
the expansion (10.9.12) of the function $f(x)$ of Example 1 shows that this
function consists of a constant term $\pi/2$ and harmonics $\cos x$,
$\cos 3x$, $\cos 5x$, etc. taken with coefficients $4/\pi$, $4/9\pi$,
$4/25\pi$, etc., respectively, while the function $g(x)$ of Example 2
consists of harmonics $(4/\pi) \sin x$, $(4/3\pi) \sin 3x$,
$(4/5\pi) \sin 5x$, etc. (see the expansion (10.9.13)).

Fourier analysis of (periodic) functions plays an extremely important role
in mathematics and applications (for one, in the study of various
oscillatory processes); it is often performed automatically via so-called
Fourier analyzers. Fourier analysis is sometimes called spectrum analysis,
which of course stems from the fact that ordinary light consists of waves
with definite wavelengths (or frequencies or periods), and a definite
wavelength corresponds to a definite color in the spectrum. Generally, the
expansion of an arbitrary (nonperiodic) function contains harmonics of the
$a(\omega) \cos (\omega t + \varphi(\omega))$ type corresponding to the
entire range of frequencies $\omega$, so that we are forced to replace the
Fourier series with the Fourier integral over this range (the case of a
function having a continuous spectrum). Here, however, we will not dwell on
this more complicated question.$^{10.18}$


Exercises 

10.9.1. Prove formulas (10.9.10), (10.9.15), (10.9.17), and (10.9.19).

10.9.2. Expand a function $y = f(x)$ defined in an arbitrary interval
$a \le x \le b$ in a trigonometric series. [Hint. Substitution
$x = \frac{b - a}{2\pi} x_1 + \frac{a + b}{2}$ transforms $f(x)$ into a
function $f_1(x_1)$ defined on the interval $-\pi \le x_1 \le \pi$.]

10.9.3. Expand the following functions in trigonometric series:
(a) $y = x^2$, $-\pi \le x \le \pi$;
(b) $y = c_1 x$ for $-l < x < 0$ and $y = c_2 x$ for $0 < x < l$; and
(c) $y = \sin x$ for $0 < x < \pi$ and $y = 0$ for $-\pi < x < 0$.


10.18 The interested reader is referred to the book [15].




Chapter 11 The Thermal Motion of Molecules. 

The Distribution of Air Density in the Atmosphere 


11.1 The Condition for Equilibrium 
in the Atmosphere 

Let us consider the question of the law of distribution of air density in
the atmosphere in altitude. It is common knowledge that at high altitudes
the air is less dense and the air pressure is lower than at sea level. The
reason for the dependence of pressure upon altitude is obvious. Let us
mentally select a cylindrical volume (altitude $\Delta h$, base area $S$,
volume $S \Delta h$). The air in this volume (mean density $\bar\rho$, mass
$\bar\rho S \Delta h$) is attracted to the earth, that is, experiences the
force of gravity directed downward and equal to
$mg = \bar\rho S g \Delta h$. However, this volume does not fall and is in
a state of rest for the reason that at altitude $h$ it is acted upon from
below by a pressure $p(h)$ that exceeds the pressure from above at altitude
$h + \Delta h$, which pressure is equal to $p(h + \Delta h)$
(Figure 11.1.1). The pressure force on the lower base of the cylinder is
$S p(h)$; it balances the sum of the pressure force on the upper base and
the force of gravity:

$$S p(h) = S p(h + \Delta h) + \bar\rho S g \Delta h. \tag{11.1.1}$$

Formula (11.1.1) can be rewritten as

$$p(h) - p(h + \Delta h) = \bar\rho g \Delta h. \tag{11.1.2}$$

We will assume that $\Delta h$ is very small. Then instead of $\bar\rho$ we
can simply speak of the density $\rho$, since the altitudes $h$ and
$h + \Delta h$ are almost the same and $\bar\rho$ differs very little from
the density $\rho(h)$ at altitude $h$. Therefore, assuming that on the left
of (11.1.2) we have $-dp$ and replacing $\Delta h$ with $dh$, we obtain

$$\frac{dp}{dh} = -\rho g. \tag{11.1.3}$$

Figure 11.1.1

We have thus obtained a differential equation for the dependence of
pressure $p = p(h)$ on height. This equation also involves the air density
$\rho$.

We will assume that the temperature 
of the atmosphere is the same at all 
altitudes. Actually, the air temperature 
depends on the heat flux from the sun 
and the removal of heat due mainly to 
heat radiation by the air into outer 
space or, to be more exact, by the 
water vapor and carbon dioxide in the 
air. A small portion of the solar radia- 
tion is absorbed by the upper rarefied 
layers of the air, while the larger portion 
reaches the earth, heats the ground, 
and then the ground heats the air. 
This explains the actually rather com- 
plicated distribution of temperature in 
the atmosphere: at ground level the temperature is known to fluctuate
roughly between −40 °C and +40 °C depending on the geographical location
and time of year; at an altitude of about 15 km the temperature is minimal
(about −80 °C) and is approximately the same both in summer and winter
round the globe. At considerable altitudes the air temperature increases,
reaching +60 °C to +75 °C at altitudes between 50 and 60 km.

Recent measurements by means of 
artificial earth satellites show that at 
altitudes of 300 to 1000 km the air den- 
sity is low but still is greater than was 
earlier thought. As we will see later on, 
high air density indicates a very high 
temperature of the air in these upper
strata. What is more, a substantial
portion of the molecules of oxygen and 
nitrogen break up at these altitudes in- 
to atoms, ions, and electrons. 

If there were no influx of heat from 
without or any removal of heat, that 
is, if we were to consider a heat-insulated 
column of air, then the temperature 
throughout the column would eventual- 
ly even out. Below we will consider 
precisely this kind of an idealized case 
of total equilibrium, both thermal and 
mechanical. Thermal equilibrium means 
that the temperature is everywhere 
the same and so there are no fluxes of 
heat (if the temperature differed at dis- 
tinct points in the air column, the heat 
would move from the hotter points to the 
colder ones, with a resultant flow of 
heat). Mechanical equilibrium consists 
in the resultant of all the forces acting 
on any volume of air chosen in the at- 
mosphere being equal to zero. Here we 
have to consider the force of gravity on 
the air in the volume and the pressure 
on the entire surface bounding the given 
volume. 

For the pressure distribution that satisfies Eq. (11.1.3), the atmosphere
can be in a state of rest.

Since we consider altitudes h small by 
comparison with the earth’s radius, g 
(the acceleration of gravity) can be 
regarded as constant. 


11.2 The Relationship Between Density 
and Pressure 

Equation (11.1.3) contains two unknown quantities: the pressure $p$ and the
density of air, $\rho$. We must therefore start by establishing a relation
between them.

By Boyle's law the product of the pressure of a gas and the volume occupied
by this gas is constant for a given mass $m_0$ of the gas and for a given
temperature: $pV = a$, where $a$ is a constant. If the gas density is
$\rho$, then $m_0 = V\rho$. Hence, $V = m_0/\rho$, and since $p = a/V$, we
can write

$$p = b\rho, \tag{11.2.1}$$

with $b = a/m_0$. Thus the gas pressure is directly proportional to the
density.

It is easy to find the constant of proportionality for air at room
temperature. We know that the air pressure at sea level, $p_0$, is roughly
equal to $10^5$ N/m$^2$ ($= 10^5$ Pa). The air density $\rho_0$ at pressure
$p_0$ is equal to 1.3 kg/m$^3$, approximately.$^{11.1}$ Substituting $p_0$
and $\rho_0$ into (11.2.1), we get $p_0 = b\rho_0$, whence$^{11.2}$

$$b \approx 10^5/1.3 \approx 7.7 \times 10^4 \ \text{m}^2/\text{s}^2.$$

Later on we will need not only the numerical value of $b$ for air at room
temperature but also the general expression of the constant $b$ for any gas
and any temperature. To this end we take advantage of the ideal gas law
(sometimes called the Clapeyron ideal gas law)

$$pV = RT, \tag{11.2.2}$$

where $V$ is the volume occupied by one gram-molecule (or simply one mole)
of gas, $T$ is the absolute temperature (reckoned from absolute zero,
−273.15 °C),$^{11.3}$ and $R$ is the molar gas constant (sometimes called
the universal gas constant). We know that at 0 °C (equal to approximately
273 K on the absolute scale) and at atmospheric pressure at sea level,
$p_0 = 10^5$ Pa, one mole of gas occupies a volume equal to 22.4 liters, or
$2.24 \times 10^{-2}$ m$^3$ (Avogadro's law), whence
$10^5 \times 2.24 \times 10^{-2} = 273R$, or

$$R \approx 8.3 \ (\text{N/m}^2) \cdot \text{m}^3/(\text{mol} \cdot \text{K}) = 8.3 \ \text{J}/(\text{mol} \cdot \text{K}).$$
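
The two estimates made so far amount to one division each. The short Python
sketch below is an added illustration; the constant and function names are
ours, and the inputs are the rounded values quoted in the text. With these
rounded inputs the gas constant comes out close to 8.2, which the text
rounds to 8.3 (the accepted value is about 8.31 J/(mol·K)).

```python
P0 = 1.0e5         # sea-level pressure, Pa (rounded value used in the text)
RHO0 = 1.3         # sea-level air density, kg/m^3 (value used in the text)
V_MOLAR = 2.24e-2  # volume of one mole at 0 deg C and pressure P0, m^3
T_ZERO_C = 273.0   # 0 deg C on the absolute scale, K


def boyle_constant(p0=P0, rho0=RHO0):
    """Constant b in p = b * rho, formula (11.2.1), in m^2/s^2."""
    return p0 / rho0


def gas_constant(p0=P0, volume=V_MOLAR, temperature=T_ZERO_C):
    """Molar gas constant R from p * V = R * T, formula (11.2.2), in J/(mol*K)."""
    return p0 * volume / temperature


if __name__ == "__main__":
    print("b =", boyle_constant(), "m^2/s^2")
    print("R =", gas_constant(), "J/(mol*K)")
```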

11.1 This quantity can easily be found experimentally by weighing. A
hermetically sealed vessel of known volume is weighed with and without air
(a vacuum pump is used to evacuate the vessel).

11.2 Observe that $b$ has the dimensions of the square of velocity.
Actually this quantity is closely connected with the velocity of molecules
and sound: the square of the speed of sound is equal to $1.4b$ (we will not
derive this relation).

11.3 Ordinarily, temperatures on the absolute scale are measured in kelvins
(symbolized K), after the English scientist Lord Kelvin: 20 °C = 293 K
(read: “20 degrees Celsius is equal to 293 kelvins”).

We denote the relative molecular weight of the gas by $M$. For hydrogen we
have $M = 2$, for helium $M = 4$, for nitrogen $M = 28$, and for air the
mean value of $M$ is 29.4. By definition, $V$ contains $M$ grams, or
$M \times 10^{-3}$ kilograms, of substance. This means that the density
$\rho$ is connected with $V$ by the relation

$$\rho = 10^{-3} M/V, \quad \text{or} \quad V = 10^{-3} M/\rho.$$

Here density $\rho$ is expressed in kg/m$^3$ and volume $V$ in m$^3$.
Substituting this expression for $V$ into (11.2.2), we get

$$p = 10^3 \rho \, \frac{RT}{M}. \tag{11.2.3}$$

Comparing this with (11.2.1), we find that

$$b = 10^3 \, \frac{RT}{M}. \tag{11.2.4}$$

Finally, let us express the pressure in terms of the number of molecules
$n$ contained in a unit volume of gas. We know that one mole of any
substance contains about $6 \times 10^{23}$ molecules. This quantity is
known as Avogadro's number and is denoted by $A$. Thus, the mass of one
molecule (in kilograms) is

$$m = 10^{-3} \, \frac{M}{A}. \tag{11.2.5}$$

If one mole of gas occupies volume $V$, then the number of molecules per
unit volume is $n = A/V$. The gas density $\rho = nm$. The ideal gas law
(11.2.2) yields

$$p = n \, \frac{RT}{A} = nkT,$$

where $k$ is the Boltzmann constant:

$$k = \frac{R}{A} \approx \frac{8.3}{6 \times 10^{23}} \approx 1.38 \times 10^{-23} \ \text{J/K}.$$

The quantity $R$ refers to a conventionally chosen amount of substance, one
mole, and so the dimensions of $R$ involve the mole. The quantity $k$
refers to one molecule, and so $k$ has the dimensions of J/K. The quantity
$kT$ has the dimensions of energy (J). In Section 11.4 it will be shown
that in the atmosphere the quantity $kT$ is equal to the mean potential
energy of one molecule in the field of gravity at temperature $T$. The mean
kinetic energy of translational motion of one molecule is $(3/2)kT$.

11.3 Density Distribution

From formula (11.2.1) we find $\rho = p/b$. Putting this ratio into the
differential equation (11.1.3) for the air pressure, we get

$$\frac{dp}{dh} = -\frac{g}{b} \, p.$$

The solution to the equation is $p = Ce^{-(g/b)h}$, where $C$ must be
determined from the initial condition. Let $p = p_0$ at $h = 0$. Then

$$p = p_0 e^{-(g/b)h}. \tag{11.3.1}$$

Dividing both sides of (11.3.1) by $b$, we get

$$\rho = \rho_0 e^{-(g/b)h}, \tag{11.3.2}$$

where $\rho_0$ is the air density at $h = 0$ (at sea level). From formula
(11.3.1) it is evident that at altitude $H = b/g$ above sea level the air
pressure diminishes by a factor of $e$. Using $H$, we can rewrite (11.3.1)
and (11.3.2) thus:

$$p = p_0 e^{-h/H}, \qquad \rho = \rho_0 e^{-h/H}. \tag{11.3.3}$$

Let us compute $H$ using the formula $H = b/g$:

$$H \approx (7.7 \times 10^4)/10 = 7.7 \times 10^3 \ (\text{m}^2/\text{s}^2)/(\text{m/s}^2) = 7.7 \ \text{km}.$$

(This value of $H$ corresponds to a mean temperature around 20 °C.)

If the altitude is increased in arithmetic progression, the pressure and
density fall in geometric progression: $p = p_0$ and $\rho = \rho_0$ at
$h = 0$; if $h = H$, then $p = p_0/e \approx 0.368 p_0$ and
$\rho \approx 0.368 \rho_0$; if $h = 2H$, then
$p = p_0/e^2 \approx 0.135 p_0$ and $\rho \approx 0.135 \rho_0$; if
$h = 3H$, then $p = p_0/e^3 \approx 0.05 p_0$ and
$\rho \approx 0.05 \rho_0$; and so on.

We obtain a formula relating $H$ and $kT$. We know that $H = b/g$. Using
(11.2.4) and (11.2.5), we arrive at

$$H = 10^3 \, \frac{RT}{Mg} = \frac{RT}{Amg},$$

or

$$H = \frac{kT}{mg}. \tag{11.3.4}$$
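
The relations of this section can be tried out numerically. The Python
sketch below is an added illustration under stated assumptions: the rounded
constants are those quoted in the text, the default $g = 9.8$ m/s$^2$ and
the function names are ours, and the value $H = 7.7$ km is simply taken
from the computation above.

```python
import math

R = 8.3          # molar gas constant, J/(mol*K), rounded as in the text
A = 6.0e23       # Avogadro's number, 1/mol, rounded as in the text
K_BOLTZ = R / A  # Boltzmann constant, J/K


def scale_height_molar(temp_k, molar_weight, g=9.8):
    """H = 10^3 * R * T / (M * g), with M in grams per mole."""
    return 1.0e3 * R * temp_k / (molar_weight * g)


def scale_height_molecular(temp_k, molar_weight, g=9.8):
    """H = k * T / (m * g), formula (11.3.4), with m = 10^-3 * M / A in kg."""
    m = 1.0e-3 * molar_weight / A
    return K_BOLTZ * temp_k / (m * g)


def pressure(h, p0=1.0e5, scale_height=7.7e3):
    """Barometric formula (11.3.3): p = p0 * exp(-h / H), h and H in meters."""
    return p0 * math.exp(-h / scale_height)


if __name__ == "__main__":
    # The two expressions for H are algebraically identical.
    print(scale_height_molar(293.0, 29.4), scale_height_molecular(293.0, 29.4))
    # Pressure ratios p/p0 at h = 0, H, 2H, 3H form a geometric progression.
    for n in range(4):
        print(n, pressure(n * 7.7e3) / 1.0e5)
```

The second loop reproduces the ratios 1, 0.368, 0.135, 0.05 listed above;
the first line shows only that the two forms of (11.3.4) agree, not which
temperature matches the value 7.7 km.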


Knowing the dependence of density on altitude, we can express the total
mass of air, $m_a$, in a column with base area of 1 m$^2$. Indeed,

$$m_a = \int_{0}^{\infty} \rho \, dh = \int_{0}^{\infty} \rho_0 e^{-h/H} \, dh.$$

Make the change of variable $z = h/H$. Then $dz = (1/H) \, dh$ and

$$m_a = \rho_0 H \int_{0}^{\infty} e^{-z} \, dz = -\rho_0 H e^{-z} \Big|_{0}^{\infty} = \rho_0 H.$$

Using the relation $m_a = \rho_0 H$ we can compute $H$ once again (by way
of a check). Since the atmospheric pressure at sea level is approximately
$10^5$ Pa $= 10^5$ N/m$^2$, the mass of air in a column with base area of
1 m$^2$ presses against the earth with a force of $10^5$ N, that is, this
mass is approximately $10^5/g \approx 10^4$ kg. Thus, $m_a = 10^4$ kg/m$^2$.
Knowing that $\rho_0 \approx 1.3$ kg/m$^3$, we get

$$H = \frac{m_a}{\rho_0} = \frac{10^4}{1.3} \approx 7.7 \times 10^3 \ \text{m} = 7.7 \ \text{km},$$

in accord with the earlier computation.

Finally, let us find the mean altitude $\bar h$ at which the air is
located, that is, the altitude of the center of gravity of a vertical
cylindrical column of air. So as not to introduce extra quantities, we
consider a column of air with base area of 1 m$^2$, although it is clear
that the altitude of the center of gravity is independent of the base area
of the cylinder. At a height between $h$ and $h + dh$ is a mass of
$dm = \rho \, dh$. The mean altitude is

$$\bar h = \frac{\int_{0}^{\infty} h \, dm}{\int_{0}^{\infty} dm}
= \frac{\int_{0}^{\infty} h \rho(h) \, dh}{\int_{0}^{\infty} \rho(h) \, dh}.$$

Let us find the integral in the numerator. Using the second formula in
(11.3.3) and substituting $z$ for $h/H$, that is, $dh = H \, dz$, we get

$$\int_{0}^{\infty} h \rho(h) \, dh = \int_{0}^{\infty} h \rho_0 e^{-h/H} \, dh
= \rho_0 H^2 \int_{0}^{\infty} z e^{-z} \, dz = \rho_0 H^2$$

(since via the formula of integration by parts,

$$\int_{0}^{\infty} z e^{-z} \, dz = -z e^{-z} \Big|_{0}^{\infty} + \int_{0}^{\infty} e^{-z} \, dz
= \left( -z e^{-z} - e^{-z} \right) \Big|_{0}^{\infty} = 1;$$

compare with formula (7.6.19)). The final result is

$$\bar h = \frac{\rho_0 H^2}{\rho_0 H} = H. \tag{11.3.5}$$

Thus, the altitude $H$ at which the density and the pressure of the air
diminish $e$-fold is at the same time the mean altitude at which the air is
located.
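
Both integrals can be confirmed by direct numerical integration. The Python
sketch below is an added check, not part of the original derivation; it
assumes a midpoint rule, truncates the infinite upper limit at $40H$ (where
the integrands are negligible), and uses the values of $\rho_0$ and $H$
quoted in the text. The function names are ours.

```python
import math

RHO0 = 1.3  # sea-level air density, kg/m^3 (value used in the text)
H = 7.7e3   # scale height, m (value computed in the text)


def density(h):
    """Density profile (11.3.3): rho = rho0 * exp(-h / H)."""
    return RHO0 * math.exp(-h / H)


def integrate(func, a, b, steps=40000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / steps
    return step * sum(func(a + (i + 0.5) * step) for i in range(steps))


def column_mass(top=40 * H):
    """Mass of air above 1 m^2 of ground, in kg; should be close to rho0 * H."""
    return integrate(density, 0.0, top)


def mean_altitude(top=40 * H):
    """Altitude of the center of gravity of the column; should be close to H."""
    return integrate(lambda h: h * density(h), 0.0, top) / column_mass(top)


if __name__ == "__main__":
    print(column_mass(), RHO0 * H)
    print(mean_altitude(), H)
```

The first line should show a mass of about $10^4$ kg per square meter, and
the second a mean altitude practically equal to $H$.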

A similar result was obtained earlier when we considered radioactive decay
(Section 8.1): if the probability of decay is $\omega$, with
$dn/dt = -\omega n$ and $n = n_0 e^{-\omega t}$, then during time
$\tau = 1/\omega$ the amount of radioactive substance decreases by a factor
of $e$, and the mean lifetime of a radioactive atom is equal to the same
quantity: $\bar t = \tau = 1/\omega$.

Remember that the simple dependence of density and pressure on altitude,
(11.3.3), refers to the case of a constant temperature. Actually, the
distribution of density and pressure departs somewhat from formula (11.3.3)
and depends on the time of year and other factors.


Exercises 

11.3.1. Find the air pressure in a mine at 
the following depths: 1 km, 3 km, and 10 km. 

11.3.2. Find the dependence of air pressure
on altitude for air temperatures equal to
-40 °C and +40 °C.

11.3.3. Suppose the air temperature varies
with altitude by the law $dT/dh = -aT_0$,
where $T_0$ is the air temperature at the earth’s
surface and $a$ is a constant coefficient. Find the
air pressure as a function of altitude.
11.3.4. We know that under the conditions
of exercise 11.3.3 the constant $a$ is approximately
equal to $0.037 \times 10^{-5}$ cm$^{-1}$. Using
the result of exercise 11.3.3, determine the air
pressure in a mine at depths of 1 km, 3 km,
and 10 km. The temperature at ground level is
taken to be 0 °C. Compare the results with
those of exercise 11.3.1.

11.4 The Molecular Kinetic Theory 
of Density Distribution 

In the preceding sections we found the 
distribution of air density in altitude 
under the action of gravity in a state 
of equilibrium. We regarded the air as 
a continuous medium with a given 
dependence of pressure on density. 

Now let us take that result and ap- 
proach it from another angle, namely, the 
viewpoint of molecular theory. We will 
consider the separate molecules and their 
motion. The idea that matter consists 
of individual atoms was first expressed 
in ancient Greece. However, the motion 
of molecules and its connection with 
heat was first examined by the great 
Russian scholar M.V. Lomonosov, who 
is thus the founder of the molecular 
kinetic theory. 

The gaseous state differs from the 
liquid and solid states in that in a gas 
the molecules may be regarded as inde- 
pendent and noninteracting. The motion 
of molecules in a gas is that of free 
flight by inertia. From time to time 
the molecules collide. Under ordinary 
conditions, such collisions occur with 
extreme frequency and the path lengths 
which molecules traverse between 
collisions are extremely small. 

At atmospheric pressure and a temperature of 0 °C,
22.4 liters of gas comprise one mole of a substance,
or $6 \times 10^{23}$ molecules, while 1 m$^3$ of gas contains
$n \approx 6 \times 10^{23}/(2.24 \times 10^{-2}) \approx 2.7 \times 10^{25}$ molecules.

For our crude purposes we will regard
molecules as spheres of radius about
$2 \times 10^{-10}$ m.[11.4] Then for a collision
between two molecules to take place it
is necessary that the trajectory of the
center of one molecule hit a target of
radius $4 \times 10^{-10}$ m about the center of
the other molecule. The area of such
a target is $\sigma = \pi r^2 \approx 5 \times 10^{-19}$ m$^2$
(here $r$ is the radius of the target).
This means that over a path length of
1 m a given molecule will collide with
all molecules whose centers lie in
a cylinder with base area $5 \times 10^{-19}$ m$^2$
and altitude 1 m. The volume of such
a cylinder is equal to $\sigma$ m$^3$, and the
number of molecules in it is $n\sigma$, where $n$
is the number of molecules in 1 m$^3$.

[11.4] In reality, diatomic molecules, say of
oxygen or nitrogen, are more like pairs of
merged spheres, something reminiscent of
peanuts (two nuts to a shell).

Thus, a molecule experiences $n\sigma$ collisions
over a path of 1 m. Therefore,
the mean distance of free flight between
collisions is

$$l = \frac{1}{n\sigma} \approx 7.5 \times 10^{-8} \text{ m}.$$

This quantity is known as the mean
free path.
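
The estimate above is plain arithmetic and can be reproduced in a few lines. This is a minimal sketch using only the rounded values quoted in the text ($6 \times 10^{23}$ molecules per mole, 22.4 liters per mole, molecular radius $2 \times 10^{-10}$ m); the function and constant names are hypothetical.

```python
import math

AVOGADRO = 6.0e23          # molecules per mole (rounded value from the text)
MOLAR_VOLUME = 2.24e-2     # m^3 per mole at 0 °C and atmospheric pressure
MOLECULE_RADIUS = 2.0e-10  # m, crude spherical model from the text

def number_density():
    """Number of molecules in 1 m^3 of gas."""
    return AVOGADRO / MOLAR_VOLUME

def collision_cross_section(radius=MOLECULE_RADIUS):
    """Area of the target: a circle whose radius is twice the molecular radius."""
    return math.pi * (2.0 * radius) ** 2

def mean_free_path(n, sigma):
    """l = 1 / (n * sigma), ignoring the motion of the target molecules."""
    return 1.0 / (n * sigma)

n = number_density()
sigma = collision_cross_section()
print("n =", n, "1/m^3")
print("sigma =", sigma, "m^2")
print("l =", mean_free_path(n, sigma), "m")
```

The sketch deliberately leaves out the correction for moving target molecules; it only reproduces the first, cruder figure.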

Because of collisions, a molecule 
moves in a polygonal line, but the volume 
of a cylinder formed from polygonal 
lines differs but little from that of a right 
cylinder and so our computations re- 
main valid. 

Actually, one has also to consider
the motion of those molecules that are
hit in collisions. It can be proved that
this circumstance changes but slightly
the mean free path of a molecule, reducing
it by a factor of only 1.5, so that
in what follows we will assume that
$l \approx 5 \times 10^{-8}$ m.

Molecules have velocities $v$ of the
order of 300 to 500 m/s. Hence, the
mean free path time, that is, the mean
time between collisions, is of the order
of

$$\tau = \frac{l}{v} \approx 10^{-10} \text{ s}.$$

At first glance, the quantities $l \approx 5 \times 10^{-8}$ m
and $\tau \approx 10^{-10}$ s are
extremely small. But they have to be
compared with the size of a molecule,
whose radius is $r \approx 2 \times 10^{-10}$ m, and
with the duration of the collision itself,
which is less than $r/v \approx 10^{-13}$ s. If
we do that, it will be apparent that the
molecules of a gas collide very rarely:
at atmospheric pressure, the molecules
of air spend 99.9% of the time in free
flight and only 0.1% of the time in a
state of collision.

Molecular collisions in a gas do not 
affect the pressure of the gas and do not 
influence the law of distribution of den- 
sity of the gas in the atmosphere. Con- 
firmation of this fact lies in Boyle’s law 
and the ideal gas law. In Section 11.2 
these laws are written as p = nkT. 

The gas pressure depends on the number
of molecules per unit volume, but
the radius $r$ of the molecules and their
cross section $\sigma$ do not enter into the
formula. This means that the quantities
$r$ and $\sigma$ cannot enter into the formula
for the density distribution in altitude.

Let us rewrite the formula for density
distribution (11.3.2) by expressing $b$ in
terms of molecular quantities. Since
$b = 10^3 RT/M = AkT/(Am) = kT/m$,
we can write

$$\rho = \rho_0 e^{-gh/b} = \rho_0 e^{-mgh/kT} \qquad (11.4.1)$$

(compare with (11.3.3) and (11.3.4)).

Divide both sides of (11.4.1) by $m$,
where $m$ denotes the mass of one molecule.
Note that $\rho/m = n$ is the number
of molecules in unit volume at altitude
$h$, while $\rho_0/m = n_0$ is the number of
molecules in unit volume at sea level.
Formula (11.4.1) assumes the form

$$n = n_0 e^{-mgh/kT}. \qquad (11.4.2)$$

The quantity mgh is the potential 
energy of a molecule of mass m located 
at altitude h if for zero we take the 
potential energy of a molecule at sea lev- 
el, since the potential energy of a mol- 
ecule at sea level, u (0), can be chosen 
arbitrarily (see Section 9.2). Then 
u (h) = u (0) + mgh , whence mgh = 
u (h) — u (0). Formula (11.4.2) can 
be rewritten thus: 

$$n(h) = n(0)\, e^{-[u(h) - u(0)]/kT}.$$

This is the law of distribution in altitude
of the number of molecules. We can
write it like this:

$$n(h) = B e^{-u(h)/kT},$$

where $B$ is a constant defined by the
value of density at sea level ($h = 0$),

$$n(0) = B e^{-u(0)/kT}.$$

A remarkable fact is that the density 
of molecules at a certain altitude is 
only dependent on the potential energy 
of the molecules at the given site: the 
mass m of a molecule, the acceleration g 
of gravity, and the altitude h entered 
into formula (11.4.2) in exactly the
same combination ($mgh$) as they entered
into the expression for the potential
energy $u$.

Let us find the mean value of the
potential energy of a molecule, or

$$\bar u = \overline{mgh} = mg\bar h = mgH$$

(see formula (11.3.5)). Using formula (11.3.4), we get

$$\bar u = mgH = mg\,\frac{kT}{mg} = kT.$$

Thus, the mean potential energy of one
molecule is $kT$.

We have established that the distrib- 
ution of air molecules in the atmosphere 
depends on the temperature and on 
the potential energy of the molecules. 
But for a given mean potential energy u 
equal to kT different molecules have 
different potential energies. In other 
words, to a given value of u there cor- 
responds a definite distribution of mole- 
cules in potential energy. Part of the 
molecules, those below altitude H, 
have a potential energy less than kT. 
Let us find the ratio of the number of 
such molecules to the total number of 
molecules. This ratio is 

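$$\frac{\int_0^H n\,dh}{\int_0^\infty n\,dh} = \frac{n_0 \int_0^H e^{-mgh/kT}\,dh}{n_0 \int_0^\infty e^{-mgh/kT}\,dh}.$$

Let us evaluate these integrals:

$$\int_0^H e^{-mgh/kT}\,dh = -\frac{kT}{mg}\, e^{-mgh/kT}\Big|_0^H = \frac{kT}{mg}\left(1 - e^{-mgH/kT}\right) = \frac{kT}{mg}\left(1 - e^{-1}\right),$$

$$\int_0^\infty e^{-mgh/kT}\,dh = -\frac{kT}{mg}\, e^{-mgh/kT}\Big|_0^\infty = \frac{kT}{mg}.$$

Therefore

$$\frac{\int_0^H n\,dh}{\int_0^\infty n\,dh} = \frac{(kT/mg)\left(1 - e^{-1}\right)}{kT/mg} = 1 - e^{-1} \approx 0.63.$$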
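
The ratio just found can be cross-checked by direct numerical integration. This is a minimal sketch in the dimensionless variable $x = mgh/kT$ (so that the altitude $H$ corresponds to $x = 1$); the function name and the number of steps are illustrative assumptions.

```python
import math

def fraction_below(u_over_kT, steps=100_000):
    """Fraction of molecules whose potential energy is below u.

    Works in x = mgh/kT, where the distribution is exp(-x); the numerator
    is integrated by the midpoint rule, the denominator is the integral
    of exp(-x) over [0, infinity), which equals 1.
    """
    dx = u_over_kT / steps
    below = sum(math.exp(-(i + 0.5) * dx) for i in range(steps)) * dx
    total = 1.0
    return below / total

print("numerical:", fraction_below(1.0))
print("closed form 1 - 1/e:", 1.0 - math.exp(-1.0))
```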


To summarize, then, 63% of all the
molecules have a potential energy less
than the mean value, while 37% have
a potential energy exceeding the mean
value. It is now easy to calculate that
14% of all molecules have a potential
energy exceeding $2kT$, 5% of all
molecules exceeding $3kT$, and so on.
Generally speaking, the portion of molecules
whose potential energy is greater
than a given value $u$ equals $e^{-u/kT}$.
This conclusion is very general and
refers to the kinetic energy of molecules,
too, which we will study in the section
below.


11.5 The Brownian Movement
and Kinetic-Energy Distribution
of Molecules

Over a hundred years ago, an English
botanist Robert Brown observed the
random movement of microscopic particles
suspended in water or other liquid.
Einstein advanced the idea that this
movement of particles is due to their
thermal agitation. From this the conclusion
was drawn that, for one thing,
the particles would not all lie on the
bottom of a vessel but would be distributed
in height by the same law as the
distribution of molecules.

If a suspended particle has the shape
of a sphere of diameter $d = 5 \times 10^{-7}$ m,
then its volume is $\pi d^3/6 \approx 6.5 \times 10^{-20}$ m$^3$,
and with a density $\rho = 2 \times 10^3$ kg/m$^3$
the mass of a particle
is close to $1.3 \times 10^{-16}$ kg. We must also
take into account the Archimedean principle.
The mass of the water (density
$10^3$ kg/m$^3$) displaced by the particle is
half the mass of the particle. Taking this
fact into account, we can say that the
force of gravity “acts” only on half of
the particle’s mass, or approximately
$6.5 \times 10^{-17}$ kg. At room temperature,
$T = 17$ °C $= 290$ K, such particles are
distributed in height (by formula
(11.4.2)) in accordance with the law

$$n = n_0 \exp\left(-\frac{6.5 \times 10^{-17} \times 9.8}{290 \times 1.38 \times 10^{-23}}\, h\right),$$

or $n = n_0 \exp(-1.6 \times 10^5 h)$. Thus,
the number of particles per unit volume
decreases by a factor of $e$ when the
altitude is increased by $(1.6 \times 10^5)^{-1}$ m
$\approx 0.62 \times 10^{-5}$ m.
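
The numbers in this estimate are easy to reproduce. Below is a minimal sketch, assuming the values quoted above (diameter $5 \times 10^{-7}$ m, particle density $2 \times 10^3$ kg/m$^3$, water density $10^3$ kg/m$^3$, $T = 290$ K, $k = 1.38 \times 10^{-23}$ J/K, $g = 9.8$ m/s$^2$); the function names are hypothetical.

```python
import math

BOLTZMANN = 1.38e-23  # J/K
G = 9.8               # m/s^2

def effective_mass(diameter, particle_density, liquid_density):
    """Mass of a spherical particle reduced by the mass of displaced liquid."""
    volume = math.pi * diameter ** 3 / 6.0
    return volume * (particle_density - liquid_density)

def e_folding_height(mass, temperature):
    """Height kT/(m g) over which n = n0 * exp(-m g h / kT) falls e-fold."""
    return BOLTZMANN * temperature / (mass * G)

m_eff = effective_mass(5e-7, 2e3, 1e3)
print("effective mass:", m_eff, "kg")
print("e-folding height:", e_folding_height(m_eff, 290.0), "m")
```

Changing the diameter shows how sharply the height depends on particle size, since the mass enters the exponent as the cube of the diameter.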

By observing the distribution, in 
altitude, of particles of known size and 
density, it is possible to obtain the 
Boltzmann constant $k$. On the other
hand, the ideal gas law yields the magnitude
of $R = kA$, after which we can
find Avogadro’s number. This work was
carried out by Einstein and Perrin in
1903-1907 and served as a crucial experimental
corroboration of the entire
atomic-molecular theory and played a
tremendous part in the development of
physics.

There is a constant transformation 
of energy taking place when molecules 
move under the force of gravity: if a mo- 
lecule is moving downward at a given 
time, potential energy is being convert- 
ed into kinetic energy, while if a mole- 
cule is in motion upward, kinetic 
energy is being converted into potential 
energy. When a gas is in a state of
equilibrium, that is, the pressure of
the gas is balanced by gravity, the
gas molecules are actually moving at
random with high speeds. However, if
we picture to ourselves a horizontal plane
in the gas, the number of molecules
passing through it in unit time upward
is equal to the number of molecules
passing through the plane in the downward
direction, so that, on the average,
the gas is at rest. In the equilibrium
state, the transition of kinetic energy
into potential energy and the transition
of potential energy into kinetic energy
balance, since the number of molecules
moving up equals the number moving
down.

Figure 11.5.1 (panels (a), (b), (c))

It is to be noted that in random mo- 
tion the individual (identical) molecules 
have different velocities, or different 
kinetic energies. This is true because 
if two balls having identical speeds col- 
lide at an angle, the velocities of the 
balls after the collision may differ. 
Figure 11.5.1 illustrates a collision 
after which one of the balls (on the left) 
is brought to a halt, while the other 
one, moving upward, shoots off with 
double energy (Figure 11.5.1a depicts
the balls prior to collision, Figure 11.5.1b
in collision, and Figure 11.5.1c
after collision). Note the
positions of the balls at the instant of 
collision: if the second ball were, at 
collision, located below the first one, 
then it would stop and give up all its 
energy to the first ball. 

Since in molecular motion there is 
a mutual conversion of kinetic and 
potential energy, it is natural to suppose 
that the distribution of the molecules 
as to kinetic energy is similar to that 
with respect to potential energy. 

We give without proof the results of
computations carried out at the end of
the 19th century by Maxwell and Boltzmann.
The number of molecules having
velocity components along the $x$ axis
between $v_x$ and $v_x + dv_x$, along the $y$ axis
between $v_y$ and $v_y + dv_y$, and along the $z$
axis between $v_z$ and $v_z + dv_z$ is equal to

$$dn = \frac{n_0}{(2\pi kT/m)^{3/2}} \exp\left[-\frac{m\left(v_x^2 + v_y^2 + v_z^2\right)}{2kT}\right] dv_x\,dv_y\,dv_z, \qquad (11.5.1)$$

where $n_0$ is the total number of molecules,
and $m$ is the mass of one molecule.
Observe that $v_x^2 + v_y^2 + v_z^2 = v^2$, where
$v$ is the speed of a molecule. Therefore,
(11.5.1) has, in the exponent, the
quantity $(mv^2/2)/kT$, which is the ratio
of the kinetic energy of a molecule to $kT$
(the mean potential energy found in Section 11.4).
The mean kinetic energy calculated
on the basis of (11.5.1) turns out to
equal $(3/2)kT$. For the number of
molecules $n$ whose kinetic energy exceeds
the given value $E$ we have a rather
unwieldy relationship. True, this complicated
relationship can approximately
be described by the simple formula

$$n \approx n_0 e^{-E/kT}. \qquad (11.5.2)$$

This law yields an incorrect value for
the mean kinetic energy of the molecules:

$$\bar E_{\text{kin}} = \frac{1}{n_0} \int_0^\infty n\,dE = \int_0^\infty e^{-E/kT}\,dE = kT$$

instead of $(3/2)kT$. It gives perceptible
departures from the true value if $E$
is of the order of $kT$. However, when
$E \gg kT$, the divergence between the
exact and the approximate law is not
essential.

It will be noted that for the same
temperature, molecules with different
masses have the same mean kinetic
energies and have the same distribution
as to the magnitude of kinetic energy,
since the mean velocity of a molecule
is proportional to $1/\sqrt{m}$, where $m$ is
the mass of a molecule.

Considering the collisions of molecules
against the walls of the containing
vessel, we can find the gas pressure to be

$$p = \frac{2}{3}\, n_0 \bar E_{\text{kin}}.$$

Putting $\bar E_{\text{kin}} = (3/2)kT$, we get the
ideal gas law

$$p = n_0 kT.$$
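
To see how the simple law (11.5.2) compares with the exact one, the two can be tabulated side by side. The text does not write out the exact relationship; the sketch below assumes the standard closed form for the Maxwell distribution, in which the fraction of molecules with kinetic energy above $E$ is $\operatorname{erfc}(\sqrt{x}) + (2/\sqrt{\pi})\sqrt{x}\,e^{-x}$ with $x = E/kT$. That formula and the function names are assumptions brought in for illustration.

```python
import math

def exact_fraction_above(x):
    """Fraction of molecules with kinetic energy above E, where x = E/kT.

    Assumed closed form for the three-dimensional Maxwell distribution
    (not given in the text): erfc(sqrt(x)) + (2/sqrt(pi)) sqrt(x) exp(-x).
    """
    root = math.sqrt(x)
    return math.erfc(root) + 2.0 / math.sqrt(math.pi) * root * math.exp(-x)

def approx_fraction_above(x):
    """Simplified law (11.5.2): n/n0 = exp(-E/kT)."""
    return math.exp(-x)

for x in (1.0, 5.0, 17.0):
    print(x, exact_fraction_above(x), approx_fraction_above(x))
```

Under this assumption the two expressions share the same exponential factor and differ only by a slowly varying multiplier, so on a logarithmic scale they stay close for large $E/kT$.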


Mutual collisions of molecules give 
rise not only to an exchange of kinetic 
energy between molecules but also 
to a conversion of the kinetic energy
of motion of the molecules into the
energy of rotation of a molecule and in- 
to the energy of vibrations of the atoms 
of the molecule, which is to say, into 
the internal energy of the molecule. 
The converse is also possible: in a col- 
lision, part of the internal energy of 
molecules is transformed into kinetic 
energy. It is therefore natural that the 
distribution of molecules as to their 
internal energy W also obeys the law of 
proportionality to the quantity $e^{-W/kT}$.
The fact that the number of particles 
with a given energy is an exponential 
function of the energy is a universal 
law of nature. 

11.6 Rates of Chemical Reactions 

Of what use is the law of distribution of 
molecules as to kinetic energy? Such 
important characteristics of a gas as the 
pressure it exerts on the walls of the 
containing vessel, its heat capacity, 
and the total reserve of energy in the 
volume of the gas are defined by mean
quantities, which is to say, they are 
defined by the bulk of the molecules 
whose energy is close to the mean value. 
For example, why do we have to know 
that a minute portion (of the order of 
0.00001%) of the molecules have kinet- 
ic energy exceeding 17 kT ? These sepa- 
rate molecules with very large energies 
have practically no perceptible effect 
on the pressure and the general supply 
of energy of the gas. 

However, the picture changes drasti- 
cally if we consider chemical reactions. 
It turns out that precisely these rare
molecules with high energy completely
determine the course of chemical reactions.
The mystery of chemical reactions
stems from the fact that molecules
entering into a reaction collide every
$10^{-10}$ second, whereas a reaction frequently
requires several minutes (sometimes
hours). This means that only
an extremely small portion of all collisions
results in a chemical reaction.

The idea was advanced that molecules 
have a certain very small “sensitive 
spot” that must be touched in order for 


a reaction to occur. This is reminiscent 
of the Greek hero Achilles who was 
vulnerable only in the heel. 

A proper explanation was finally
given at the end of the 19th century by
the Swedish scientist Svante Arrhenius.
It is this: reactions are initiated only
by collisions of molecules whose energy
exceeds a definite value, the so-called
activation energy $E_a$.

For instance, for colliding molecules of hydrogen
and iodine to form two
molecules of hydrogen iodide HI, the
energy of the colliding molecules must
exceed $3 \times 10^{-19}$ J. Compare this with
the fact that at 0 °C the magnitude
of $kT$ is $1.38 \times 10^{-23} \times 273 \approx 3.8 \times 10^{-21}$ J.
This means that at room
temperature (or at 0 °C, which is of the
same order of magnitude) only a minute
fraction of the molecules possess the
needed energy, $\alpha = e^{-\nu}$, where
$\nu = 3 \times 10^{-19}/(3.8 \times 10^{-21}) \approx 80$, whence
$\alpha \approx 1/10^{35}$.

We get the reaction time by multiplying
the time between collisions (it
is of the order of $10^{-10}$ s) by the mean
number of the collisions among which
there will be one collision involving
the required energy. This mean number
of collisions is of the order of $1/\alpha \approx 10^{35}$.
We obtain the reaction time
at 0 °C of the order of $10^{25}$ s $\approx 3 \times 10^{17}$ years.
This result accords with
the fact that at 0 °C the reaction
H$_2$ + I$_2$ = 2HI is practically unobservable.

From the reasoning given above it 
follows that, depending on the tem- 
perature, the reaction time is expressed 
by the formula 

$$t = \tau e^{E_a/kT},$$

where $\tau$ is the time between two collisions,
and $E_a$ is the activation energy.
This formula gives a true picture of the 
dependence of the rate of chemical 
reactions on the temperature. A charac- 
teristic feature of this formula is the 
extremely sharp decrease in reaction 
time and increase in reaction rate for 
slight variations in temperature. 
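
The sharpness of this temperature dependence is easiest to appreciate numerically. Here is a minimal sketch of the formula for the reaction time, assuming the figures used above for the hydrogen-iodine reaction (activation energy $3 \times 10^{-19}$ J, time between collisions $10^{-10}$ s); the temperatures other than 273 K are arbitrary illustrative choices, and the time between collisions is held fixed for simplicity.

```python
import math

BOLTZMANN = 1.38e-23  # J/K

def reaction_time(activation_energy, temperature, collision_time=1e-10):
    """Reaction time t = tau * exp(E_a / kT), in seconds.

    collision_time (tau) is kept constant here, which is an assumption:
    in reality it also varies, though only weakly, with temperature.
    """
    return collision_time * math.exp(activation_energy / (BOLTZMANN * temperature))

E_A = 3e-19  # J, activation energy quoted in the text
for T in (273.0, 373.0, 573.0, 773.0):  # kelvin
    print(T, "K:", reaction_time(E_A, T), "s")
```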
However, it frequently happens that
chemical reactions are much more involved
because they may proceed via diverse
intermediate stages. By way of an
illustration, let us examine the reaction

H$_2$ + Cl$_2$ = 2HCl.

This reaction proceeds not via collisions
of a molecule of hydrogen and a molecule
of chlorine but by the scheme

H$_2$ + Cl$_2$ = H$_2$ + Cl + Cl,

Cl + H$_2$ = HCl + H,

H + Cl$_2$ = HCl + Cl.

As a result the actually observed reac- 
tion rate involves complicated rela- 
tionships. However, for each separate 
reaction, say for 

Cl + H$_2$ = HCl + H (in the reaction
H$_2$ + Cl$_2$ = 2HCl),

the Arrhenius law holds true, and the
reaction rate is proportional to $e^{-E_a/kT}$,
with the activation energy $E_a$ having
different values for each reaction.

The Soviet scientist Academician 
N.N. Semyonov made a thorough in- 
vestigation of complex (chain) chemical 
reactions and elucidated the laws govern- 
ing the course of such reactions and 
the general causes that lead to such com- 
plicated reaction schemes. 

11.7 Evaporation. The Emission 
Current of a Cathode 


The idea of Svante Arrhenius concerning
the role of a small number of
molecules whose energy greatly exceeds
the mean value of energy is helpful
in analyzing not only chemical reactions
but also a series of other phenomena
including the evaporation of a liquid.

Evaporation requires the expenditure
of a considerable amount of energy.
For example, the evaporation of 1 gram
of water at 100 °C requires the consumption
of about 2260 J.[11.5] Per molecule,
this comes out to[11.6]
$Q = 18 \times 2260/(6 \times 10^{23}) \approx 7 \times 10^{-20}$ J. But
at $T = 0$ °C $= 273$ K we have
$kT \approx 273 \times 1.38 \times 10^{-23}$ J $\approx 3.8 \times 10^{-21}$ J,
whence $Q/kT \approx 20$. Only
those molecules can tear away from the
surface and evaporate whose energy
exceeds the evaporation heat $Q$. The
fraction of such molecules is equal to
$e^{-Q/kT}$. Therefore, the rate of evaporation
is also proportional to $e^{-Q/kT}$. For
computational convenience it is common
practice to multiply the numerator
and the denominator of the expression
$Q/kT$ by Avogadro’s number $A$:

$$\frac{Q}{kT} = \frac{QA}{kAT} = \frac{QA}{RT}.$$

The quantity $QA$ is the evaporation
heat of $6 \times 10^{23}$ molecules, which is to
say the evaporation heat of one mole of
substance. The quantity $kA = R$ is the
molar gas constant: $R \approx 8.3$ J/(mol·K)
(see Section 11.2). The evaporation heat
of one mole of water is equal to
$Q_m \approx 18 \times 2260 \approx 40\,000$ J/mol. Thus,
the rate of the evaporation of water is
proportional to $e^{-40\,000/8.3T} \approx e^{-5000/T}$.

[11.5] The evaporation heat is but slightly
dependent on temperature; for water $Q =$
2260 J/g at 100 °C and 2500 J/g at 0 °C. We
henceforth disregard this dependence.

[11.6] Water has a molecular weight
18 and Avogadro’s number is equal to $6 \times 10^{23}$.

Let us consider the saturated vapor
above a water surface. If the vapor is
saturated, the number of molecules of
water evaporating per unit time is equal
to the number of molecules in the
vapor and adhering to the surface of
the water (condensing) in unit time.
The rate of evaporation is $Ce^{-Q_m/RT}$,
where $C$ is a constant proportional to
the area of the water surface. The rate
of condensation is proportional to the
pressure of water vapor and to the surface
area. Hence, in the case of saturated
vapor, when the rates of evaporation
and condensation are equal,

$$Dp = Ce^{-Q_m/RT},$$

where $D$ and $C$ are quantities proportional
to the surface area and only
slightly dependent on the temperature
and totally independent of the pressure,
whence

$$p = Fe^{-Q_m/RT},$$

where the constant F does not depend 
on the surface area of the water. Thus 
a relationship is established between 
the pressure of saturated vapor and 
evaporation heat. 
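
The relationship can be used to estimate how strongly the saturated vapor pressure changes with temperature. The following is a minimal sketch, assuming the rounded values $Q_m = 40\,000$ J/mol and $R = 8.3$ J/(mol·K) from the text and treating $F$ as strictly constant, so that only a ratio of pressures is computed; the function name is hypothetical.

```python
import math

R = 8.3         # J/(mol K), molar gas constant (rounded value from the text)
Q_M = 40_000.0  # J/mol, evaporation heat of one mole of water (rounded)

def vapor_pressure_ratio(t1, t2):
    """Ratio p(t2)/p(t1) from p = F * exp(-Q_m / (R T)).

    F is taken to be the same at both temperatures, which is an
    idealization; temperatures are in kelvin.
    """
    return math.exp(-Q_M / R * (1.0 / t2 - 1.0 / t1))

print("p(100 °C) / p(0 °C) =", vapor_pressure_ratio(273.0, 373.0))
print("p(20 °C) / p(0 °C) =", vapor_pressure_ratio(273.0, 293.0))
```

Since the evaporation heat itself varies somewhat with temperature, the output should be read as an order-of-magnitude estimate only.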

Let us consider yet another process
similar to evaporation, that of the
emission of electrons from a heated surface.
This process occurs on the cathode
of an electron tube. A cold cathode in
vacuum does not emit electrons.[11.7]
But at high temperatures the cathode
does emit electrons. Then if the anode
(also called plate) has a sufficiently
high positive potential, it will attract
the electrons and each electron torn out
of the surface of the cathode will fall
onto the anode. The electric current
flowing in a circuit through an electron
tube is equal to the product of the
number of electrons released by the
cathode in unit time and the magnitude
of the charge of a single electron.

Experiments show that in these con- 
ditions the following relationship exists 
between current j and temperature T: 

$$j = g e^{-Q/kT}.$$

The value of $Q$ differs for different
cathodes. For instance, for a cathode
made of pure tungsten, $Q/k = 55\,000$ K,
while for barium oxide, $Q/k = 30\,000$ K
and, hence, such a cathode
can operate at lower temperatures.
Using the dependence of $j$ on $T$, we can
determine $Q/k$. Here the quantity $Q$,
which enters into the latter formula,
coincides with the energy necessary for
tearing an electron out of the cathode
(this electron-ejection energy can be
determined by other methods).

[11.7] Here we do not consider the case of
a very strong electric field ($10^6$ V/cm and more)
capable of tearing electrons even out of a cold
cathode. Neither do we discuss the ejection
of electrons from a cathode by the action
of light or bombardment of a cathode by
electrons, ions, or other particles.

An electron tube offers a marvelous 
method for measuring the distribution 
of electrons leaving the surface of a cath- 
ode in accordance with their speeds at 
a given temperature. When the cathode
is heated, we will apply a small negative
potential $\varphi$ to the anode. With this
potential, the anode will repulse the
electrons ejected by the cathode. For
this reason, a large portion of the electrons
will not reach the anode and will
fall back onto the cathode. However,
there will be some electrons that will
reach the anode, overcoming the repulsive force.
For this to occur, the kinetic energy of
the electron ejected from the cathode
must exceed the difference in potential
energy of the anode and cathode, that
is, the quantity $\varepsilon\varphi$, where the Greek
letter $\varepsilon$ (epsilon) stands for the electron
charge (unfortunately, the letter $e$
stands here for the base of the natural
logarithms). The fraction of such electrons
is equal to $e^{-\varepsilon\varphi/kT}$. Thus, for
a negative potential $\varphi$ across the anode,
the current is equal to $j = j_0 e^{-\varepsilon\varphi/kT}$,
where $j_0$ is the current for a positive
potential. In this experiment, it is
necessary that the distance between the 
cathode and anode be small so that the 
number of electrons between them 
should not be great and the mutual 
repulsion of electrons should not affect 
the result of the experiment. 

The Soviet scientist Academician 
A.F. Ioffe proposed using this phenom- 
enon for direct conversion of thermal 
energy into electric energy. If electrons 
go from the cathode to a negatively 
charged anode, such a system is a source 
of voltage: the current in an external 
circuit between the negatively charged 
anode and the positive cathode is in 
a direction such that it performs work. 
This method of obtaining electric cur- 
rent is remarkable in that there are 
no moving parts and the circuit is fun- 
damentally simple. In this respect it 
resembles the generation of electric 
power by means of thermoelectric cells, 
which was also proposed by Academician
Ioffe. At present semiconductors,
which contain free electrons already at
room temperature, have successfully
replaced electron tubes in radio and TV
sets and in electronic computers. But
the heated cathode, which emits electrons
according to the above-described
law, is still widely used in cathode-ray
tubes.



Chapter 12 Absorption and Emission of Light. 
Lasers 


12.1 Absorption of Light: 

Statement of the Problem and 
a Rough Estimate 

Let us consider the absorption of light
in air containing black particles of
soot. Suppose a unit volume contains $N$
particles. The area of a section of one
particle by a plane perpendicular to the
ray of light is denoted by $\sigma$. For short,
we call $\sigma$ the cross section. For example,
for a particle in the shape of a sphere of
radius $r$, $\sigma$ is the area of a cross section
passing through the center of the sphere,
or $\sigma = \pi r^2$.[12.1]

We will assume that the light inci- 
dent on the surface of the particle of 
soot is completely absorbed. The prob- 
lem consists in determining the portion 
of absorbed light and the portion of 
transmitted light as a function of the 
quantities $N$, $\sigma$, and the path length $x$
that a light ray traverses through air 
containing the soot. 

We begin with the roughest kind of 
estimate of the distance over which an 
appreciable portion of light is absorbed. 
We denote this distance by L. Just 
what the pregnant expression “appreci- 
able portion of light” means will be ex- 
amined later on in the sections that fol- 
low. Let us not be upset by the clumsy 
statement of the problem, since such 
a situation is quite common in physical 
problems where time and again refine- 
ment of a problem is postponed and is 
carried out in the process of solution. 

Consider a cylinder with base area $S$
and height $L$. We require that the sum
of the cross sections of all particles in
this cylinder be equal to $S$.

[12.1] An exact definition of the cross section
$\sigma$ for a particle of intricate shape is this:
$\sigma$ is the mean (average) area of the shadow
cast by a particle on a surface perpendicular
to a ray of light. The mean area is determined
from measurements of the shadow area at
different orientations of the soot particles in
relation to the light ray, and the plane on
which the shadow from the particle falls is
assumed to be perpendicular to the ray in
all measurements.

What is the physical meaning of the 
condition thus posed? If it were possi- 
ble to arrange the particles so that the 
areas covered by various particles do 
not overlap, then using the particles 
in the cylinder of height L and base 
area S it would be possible to cover the 
whole area of the cylinder and achieve 
a complete absorption of all the light. 
For x < L, total absorption of the 
light is clearly impossible: no matter 
how the particles of soot are placed, 
the total area of their cross sections 
does not suffice to cover the whole base 
of the cylinder. In other words, there is 
not enough soot to ensure total absorp- 
tion of light. 

It is clear that at x = L or even for 
x > L there will not really be com- 
plete absorption. For a random arrange- 
ment of soot particles and for arbitrary x , 
there will remain certain directions 
along which there will not be a single 
particle in the path of the light, which 
will then pass through. 

In the volume $SL$ of the cylinder
there are $NSL$ particles, the sum of the
cross sections of which is $\sigma NSL$, and
so we require that $\sigma NSL = S$, or

$$L = \frac{1}{\sigma N}. \qquad (12.1.1)$$

Let us verify the dimensions in
(12.1.1): $\sigma$ is the area, so its dimensions
are m$^2$, and $N$ is the number of particles
per unit volume and has the dimensions
of m$^{-3}$. Consequently $[L] = $ m$^{-2} \times$ m$^{+3} = $ m,
as required.

The energy transmitted through an
area in 1 second is called radiant flux.
Let $I$ be the radiant flux through an area
of 1 square meter. This is called the energy
flux, or irradiance, and its dimensions
are J·m$^{-2}$·s$^{-1}$, or W/m$^2$. Below we
consider the energy flux of light $I(x)$
as a function of the thickness $x$ of a
layer. Clearly,

$$I(x) = I_0 f(x), \qquad (12.1.2)$$
where $I_0$ is the energy flux of the incident
light, and $f(x)$ is the desired
function that characterizes the attenuation
of the light.

What can we say about the properties
of the function $f(x)$? If $x = 0$,
there is no attenuation of light, $I(0) = I_0$,
and so $f(0) = 1$. If $x > 0$, then
the light is attenuated, $I(x) < I_0$,
and therefore $f(x) < 1$. Clearly, $f(x)$
decreases with increasing $x$ and approaches
zero, that is, $f(x)$ is a decreasing
function. Thus, the derivative is negative,
$df/dx < 0$.

We have already said that there will
not be complete absorption either for
$x = L$ or for $x > L$, and so we do not
expect $f(x)$ to vanish when $x = L$.
However, we may assume that the value
$x = L$ is a characteristic length. This
means that when light is transmitted
over a path $x \ll L$, the fraction of
absorbed light is extremely small when
compared with the fraction of transmitted
light. Over a path $x \approx L$ a perceptible
portion of the light is absorbed,
and over a path $x \gg L$, most of the
light is absorbed and only a very small
portion is transmitted.

As may be seen from formula (12.1.2),
the function $f(x)$ is dimensionless. We
can assume that if a dimensionless variable
$x/L$ is introduced, then the function
$f(x/L)$ will always be the same for
any kind of particles, for any $N$ and $\sigma$.
These suppositions will be corroborated
and made precise in the sections that
follow.


12.2 The Absorption Equation 
and Its Solution 

We conduct all calculations for a col- 
umn of air in the form of a cylinder with 
base area 1 m 2 (in the preceding section, 
when we considered a cylinder with base 
area S m 2 , the quantity S canceled 
out anyway in (12.1.1)). We consider 
a thin layer of air between x and x + dx 
in the cylinder. A beam of light consists
of parallel rays and is characterized by
the energy flux $I$. If no light were absorbed
by the soot particles, $I$ would be
constant.

The layer under consideration contains $N\,dx$ particles
covering an area of $\sigma N\,dx$ of the total area of 1 m$^2$
of the base of the layer (we assume that the shadows from these
particles do not overlap, which is quite reasonable if $dx$ is
small). Hence, the layer absorbs a fraction $\sigma N\,dx$ of
the energy incident on the layer: $dQ = I N \sigma\,dx$. When
light passes through the layer $dx$, the radiant flux is
attenuated by an amount equal to the quantity of absorbed
energy $dQ$. Prior to entry into the layer, the energy flux was
$I(x)$; after emergence from the layer, it became $I(x + dx)$,
and so

$I(x) - I(x + dx) = I \sigma N\,dx$. (12.2.1)

But $I(x + dx) - I(x) = dI$, whence (12.2.1) yields

$\frac{dI}{dx} = -I N \sigma$. (12.2.2)

The solution of this differential equation, as we know, is

$I = I_0 e^{-\sigma N x}$ (12.2.3)

(compare with formulas (6.6.10) and (6.6.9) and see Chapter 8).
Here $I_0 = I(0)$ is the value of the energy flux at $x = 0$.

If the layer thickness is increased in arithmetic progression,
$x_1 = a$, $x_2 = 2a$, $x_3 = 3a$, etc., the energy flux
decreases in geometric progression. Indeed, if we introduce the
notation $\alpha = e^{-\sigma N a}$ (where, of course,
$0 < \alpha < 1$), we find, using (12.2.3), that
$I(x_1) = I_0 \alpha$, $I(x_2) = I_0 \alpha^2$,
$I(x_3) = I_0 \alpha^3$, etc.
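
The geometric progression can be checked numerically. The sketch below is only an illustration: the function name `transmitted_flux` and all numerical values (cross section, concentration, step) are hypothetical choices, not data from the text. It evaluates (12.2.3) at equally spaced thicknesses and compares the ratio of successive fluxes with $e^{-\sigma N a}$.

```python
import math

def transmitted_flux(i0, sigma, n, x):
    """Energy flux behind a layer of thickness x, from I = I0 * exp(-sigma * N * x)."""
    return i0 * math.exp(-sigma * n * x)

# Hypothetical illustrative values, not taken from the text.
I0 = 1.0         # incident energy flux, arbitrary units
SIGMA = 1.0e-10  # shadow area of one particle, m^2
N = 1.0e8        # particles per cubic meter
STEP = 20.0      # thickness step a, in meters

# Thicknesses a, 2a, 3a, 4a form an arithmetic progression.
fluxes = [transmitted_flux(I0, SIGMA, N, k * STEP) for k in range(1, 5)]

# Ratio of each flux to the previous one; every ratio should equal
# alpha = exp(-sigma * N * a).
ratios = [later / earlier for earlier, later in zip(fluxes, fluxes[1:])]
alpha = math.exp(-SIGMA * N * STEP)

for k, flux in enumerate(fluxes, start=1):
    print(f"x = {k * STEP:5.1f} m   I/I0 = {flux:.4f}")
print(f"common ratio alpha = {alpha:.4f}")
```

With these assumed values $1/(N\sigma)$ is 100 m, so each 20 m step multiplies the flux by the same factor $e^{-0.2}$.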

12.3 The Relationship Between Exact 
and Approximate Absorption 
Calculations 

It will be highly instructive to com- 
pare the exact solution (Section 12.2) 
and the rough estimate (Section 12.1). 
Such a comparison will help us to make 
use of rough estimates in complicated 
problems where an exact solution is 
hard to find. Such a comparison also
helps one to understand the range of
applicability of a rough solution (see 
Section 10.6). 

In the rough solution we found the distance over which
appreciable absorption takes place, $L = 1/(N\sigma)$. With the
aid of the characteristic length $L$, the exact solution
(12.2.3) can be expressed as follows:

$I = I_0 e^{-x/L}$. (12.3.1)

Thus the supposition that the quanti- 
ty L, found in a crude chain of reason- 
ing, enters into the exact solution is 
fully corroborated. 

The exact solution is indeed of the form

$I = I_0 f(x/L)$,

where (cf. (12.3.1)) $f(x/L) = e^{-x/L}$.

We consider the distance $x = L$. Approximate reasoning gave
complete absorption of light over this distance. Actually, from
the exact solution (12.3.1), putting $x = L$, we find that
$I = I_0 e^{-1} \approx 0.37 I_0$, which means that 37% of the
light is transmitted through a cylinder of length $L$ and,
hence, the absorption is 63%. For small $x/L$ we express
$e^{-x/L}$ by means of the approximate formula
$e^{-u} \approx 1 - u$ (see Section 4.8), confining ourselves
to two terms of a series expansion of $e^{-u}$ (cf. Section 6.1)
to get

$e^{-x/L} \approx 1 - x/L$. (12.3.2)

Geometrically, this is tantamount to replacing the curve
$y = e^{-x/L}$ by the tangent to the curve at point $x = 0$
(Figure 12.3.1). As can be seen from (12.3.2), the tangent line
intersects the $x$ axis at $x = L$. Therefore, if absorption
were to occur at the same rate throughout, that is, so that the
same amount of light is absorbed on every unit of length
(precisely, the amount absorbed on the initial path), all the
light would be absorbed over the distance $x = L$.

Figure 12.3.1

To summarize, then, the quantity L, 
which was obtained via rough consider- 
ations, is indeed of extreme impor- 
tance in the exact solution as well. 
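
To see how the rough and exact pictures compare at different depths, the sketch below tabulates the exact transmission $e^{-x/L}$ of (12.3.1) next to the tangent-line value $1 - x/L$ of (12.3.2). The function names are illustrative, and clipping the tangent line at zero is an added assumption that expresses "all the light is absorbed beyond $x = L$".

```python
import math

def exact_transmission(u):
    """Transmitted fraction I/I0 = exp(-u), with u = x/L (formula 12.3.1)."""
    return math.exp(-u)

def tangent_transmission(u):
    """Tangent-line estimate 1 - u (formula 12.3.2), clipped at zero (assumption)."""
    return max(0.0, 1.0 - u)

# Sample depths in units of the characteristic length L.
for u in (0.01, 0.1, 0.5, 1.0, 2.0):
    exact = exact_transmission(u)
    rough = tangent_transmission(u)
    print(f"x/L = {u:4.2f}   exact = {exact:.4f}   tangent = {rough:.4f}")
```

For $x/L = 0.01$ the two values agree to about four decimal places, while at $x/L = 1$ the tangent gives zero and the exact formula gives about 0.37, as stated in the text.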

Note once more the importance in 
the practical work of a physicist or 
engineer of knowing how to find and 
apply approximate, rough, solutions, 
and one should take every opportunity 
to develop skill in finding and under- 
standing such solutions. This is far 
more important and fruitful than mali- 
cious snickering over the drawbacks 
of rough solutions. We should be pleased that a very rough
solution yields 100% absorption where the exact solution yields
63%: the error is only by a factor of about 1.6. The rough
solution, for $x = L$, yields 0% light transmission instead of
the exact value of 37%, that is, the approximate approach
diminishes the transmission of light by an infinite factor. But
that is not so bad either, because from the very start it was
evident that we could not expect good accuracy from a rough
solution, and the 0% transmission of light must be understood
only as a hint that absorption is considerable.

If it has been established that a prob- 
lem does not have an exact solution in 
the form of an explicit formula, one 
should not be deterred in the least. 
Seek only for a very rough solution of 
the problem. But when using it, be 
sure to remember that the solution is 
a rough one, an approximate one, and 
by no means an exact one. 

Let us again dwell for a moment on the question of dimensions.
We have verified the dimensions of $L = 1/(N\sigma)$ and have
established that this is length. It is often possible to find
an approximate expression for the quantity that interests us
when all we know are its dimensions and the dimensions of the
initial quantities given in the statement of the
problem.[12.2] In the case at hand, however, this is not
possible. Indeed, a quantity having the dimensions of length
can be constructed by proceeding solely from the concentration
$N$ (m$^{-3}$), namely, $l_1 = N^{-1/3}$. (The quantity $l_1$
is the mean distance between the particles.) A quantity having
the dimensions of length can also be constructed out of the
cross section $\sigma$ (m$^2$), namely, $l_2 = \sigma^{1/2}$.
(The quantity $l_2$ characterizes the mean particle size.) It
is then obvious that the quantity $l_a = l_1^a l_2^{1-a}$, for
any value of the exponent $a$, also has the dimensions of
length. For instance, the $L$ that interests us is obtained at
$a = 3$.
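
The claim about the exponent can be verified in one line. The following worked step uses only the definitions $l_1 = N^{-1/3}$ and $l_2 = \sigma^{1/2}$ given above:

```latex
\[
  l_a = l_1^{a}\, l_2^{\,1-a}
      = N^{-a/3}\, \sigma^{(1-a)/2},
  \qquad
  [\,l_a\,] = \mathrm{m}^{a}\cdot \mathrm{m}^{1-a} = \mathrm{m}.
\]
\[
  a = 3:\qquad
  l_3 = N^{-1}\,\sigma^{-1} = \frac{1}{N\sigma} = L .
\]
```

Every value of $a$ passes the dimensional test, which is why dimensions alone cannot single out $a = 3$.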

Thus, in the problem at hand, dimen- 
sional analysis does not give a definite 
answer. To find L, that is, the quantity 
with dimensions of length entering into 
the exact solution, it is precisely the 
rough solution to the problem that one, 
it turns out, has to find. But even when 
dimensional reasoning yields a unique 
answer, it is also desirable to get a rough 
solution to the problem so as to obtain 
a clearer picture of the phenomenon. 

Finally, we note that the calculations carried out in
Section 12.2 were based on the assumption that the soot
particles are distributed randomly for a given mean
concentration $N$. We assume that at the entrance ($x = 0$) the
energy flux $q_0$ is uniform over the entire cross section. But
after the light has traveled a certain distance $x$ through the
air containing the particles, there appear, strictly speaking,
certain parts of the cross section (of total area $S_1$)
covered by shadows from the particles (and there $q = 0$),
while other parts (of total area $S_2$) remain unshadowed, and
there $q = q_0$ (here $S_1 + S_2 = S$, since the total
cross-sectional area $S$ remains constant). The quantity we are
calculating here is the energy flux averaged over the entire
cross-sectional area:

$q(x) = \frac{q_0 S_2}{S_1 + S_2} = \frac{q_0 S_2}{S}$.

In calculating the variation of $q$ over the length it is
essential to assume that the particles occurring within the
layer between $x$ and $x + dx$ are distributed in this layer
randomly, that is, do not prefer illuminated or shadowed areas.

[12.2] This constitutes the method of dimensions (dimensional
analysis). The approach has wide application in physics,
mechanics, and engineering.

The reader is advised to investigate 
the situation in which the particles 
form an ordered array in the air like 
atoms in a crystal and how this in- 
fluences q (x). 


12.4 The Effective Cross Section 

In the problem of attenuation of light passing through dusty
air, the quantity $\sigma$ has a simple geometric meaning: it
is the area of the shadow cast by a single dust particle. The
law of attenuation of light (12.2.2) is the same for light of
different wavelengths (that is, different colors), since
$\sigma$ is independent of the wavelength.

Contrary to the above situation, in 
the absorption of light by separate mo- 
lecules and atoms there is observed 
a strong dependence of the law of 
attenuation of light upon the wavelength 
of the light. For example, in clean air at atmospheric
pressure, visible light is hardly attenuated at all
(attenuation is less than 1% per kilometer of path length;
accordingly, the attenuation is by a factor of $e$ over a
distance of about 100 km). Ultraviolet rays of wavelength
$1.8 \times 10^{-7}$ m = 1800 Å (Å stands for angstrom,
1 Å = $10^{-10}$ m) are attenuated by a factor of $e$ over a
distance of $L = 0.001$ m = 0.1 cm. Still shorter ultraviolet
rays of wavelength $1.1 \times 10^{-5}$ cm = 1100 Å are
attenuated by a factor of $e$ over a path length of
$L = 0.01$ cm.
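
The two ways of quoting the attenuation of visible light (about 1% per kilometer, or a factor of $e$ over about 100 km) can be cross-checked with the exponential law $I = I_0 e^{-x/L}$. The sketch below is an illustration only; the helper name `absorbed_fraction` is an assumption, and the lengths are the ones quoted above.

```python
import math

def absorbed_fraction(x, length):
    """Fraction of light absorbed over a path x for characteristic length L."""
    return 1.0 - math.exp(-x / length)

# Visible light in clean air: L of about 100 km, path of 1 km.
visible_per_km = absorbed_fraction(1.0, 100.0)

# Ultraviolet light of wavelength 1800 angstroms: L = 0.1 cm, path of 1 cm.
uv_transmitted_1cm = 1.0 - absorbed_fraction(1.0, 0.1)

print(f"visible light absorbed over 1 km: {100 * visible_per_km:.2f}%")
print(f"1800 A ultraviolet transmitted over 1 cm: {uv_transmitted_1cm:.1e}")
```

A characteristic length of 100 km thus corresponds to an absorption just under 1% per kilometer, while the ultraviolet ray is reduced by a factor of $e^{10}$ within a single centimeter.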

Consequently, the absorption of 
light by air is not like the absorption of 
light by a black dust particle, which 
absorbs light of any wavelength to 
the same degree. 





The amount of energy absorbed by a single atom in unit time,
$q$, is proportional to the energy flux $I$ of the light at the
point where the atom is located:

$q = \sigma I$. (12.4.1)

Here $\sigma$ is the constant of proportionality. Let us
determine the dimensions of $\sigma$. The dimensions of $q$ are
J/s. The dimensions of $I$ are J/(m$^2$·s). Hence, the
dimensions of $\sigma$ are m$^2$. The quantity $\sigma$ is
called the effective cross section. For a black dust particle,
the constant of proportionality coincides with the geometric
area of the shadow. For molecules and atoms, $\sigma$ is
strongly dependent on the wavelength of the light.

In rough fashion, we can picture the 
cause of this dependence as follows. 
The amount of energy absorbed by an 
atom when acted upon by light proves 
to be particularly great when the 
frequency of the light oscillations coin- 
cides with the frequency of motion of 
the electrons in the atom. This is resonance: the electron
oscillates intensively and absorbs a particularly large amount
of light energy. Such a resonance is attained, for instance, in
the absorption by sodium atoms (in the vapor state) of yellow
light of wavelength 5890 Å = $5.89 \times 10^{-7}$ m.
The very same yellow light is emitted 
by sodium atoms at higher tempera- 
tures, when electron oscillations are 
caused by energetic collisions of atoms 
among themselves. 

At resonance, $\sigma$ reaches $10^{-14}$ m$^2$, that is, it is
of the order of magnitude of the square of the wavelength.
Atoms and molecules are of size $10^{-10}$ to $10^{-9}$ m,
which corresponds to a cross section of the order of $10^{-20}$
to $10^{-18}$ m$^2$. Thus, the maximum effective cross sections
are many times greater than the true cross-sectional areas of
atoms and molecules. On the other hand, for light whose
frequency does not correspond to the natural frequency of the
atom, the effective cross section is small, much less than the
cross-sectional area of the atom.


We note, finally, that in a very rarefied gas an atom excited
by light, as a rule, re-emits the light in an arbitrary
direction, while in a dense gas the excited atom transmits the
energy to other atoms in collisions.

Thus, in a rarefied gas we have light 
scattering while in a dense gas we have 
true absorption. However, the “fate” of 
the energy taken away from the atom is 
unimportant in calculations of the de- 
crease in intensity of a directed light ray. 


12.5 Attenuation of a Charged-Particle 
Flux of Alpha and Beta Rays 

The exponential law of diminution of particle flux as a
function of distance,

$I = I_0 e^{-x/L}$, (12.5.1)

is based on a very general assumption that the attenuation of
the flux over a small distance $dx$ is proportional to the
intensity of the flux:

$\frac{dI}{dx} = -\frac{I}{L}$, (12.5.2)

where the constant of proportionality $1/L$ is dependent solely
on the type of particle. Since (12.5.1) is the solution to the
differential equation (12.5.2), it is obvious that formulas
(12.5.1) and (12.5.2) are equivalent.

Experiments have shown that in 
certain cases the exponential law (12.5.1) 
is quite exact, but sometimes devia- 
tions from the law are observed. Let us 
consider carefully the reasons that can 
give rise to deviations from formula 
(12.5.1) or (which is the same thing) 
from (12.5.2). 

It is easy to answer the question 
about the meaning of deviations from for- 
mula (12.5.2). Formula (12.5.2) pre- 
sumes that when x and I vary, the light 
(or other radiation) under consideration 
does not vary qualitatively, otherwise 
the number $L$ would change. We rewrite Eq. (12.5.2) as

$\frac{1}{I}\frac{dI}{dx} = -\frac{1}{L}$.

From this we see that the quantity $I^{-1}(dI/dx)$ is constant.
If it turns out that at different points in space this quantity
is different, this means that at such points not only is the
intensity of radiation different but also its physical
characteristics (say, light of a different color, that is,
having a different mean wavelength).

When considering problems of protection against radioactive
radiation and questions of the passage of α-, β-, and γ-rays
and neutrons through various substances, there is a different
reason for departure from the simple law (12.5.2).

As applied to the process of light absorption, the law (12.5.2)
signifies the following: if the light encounters a dust
particle, some of it passes by the particle without any change
while the other portion of the light is completely absorbed by
the dust particle. The situation is more complicated in the
case of radioactive radiations: an α-particle is a nucleus of
the helium atom flying out of the radioactive parent nucleus at
high speed (of the order of $0.07c$, where $c$ is the speed of
light, that is, at a speed of about $2 \times 10^7$ m/s). In
passing through an atom, the α-particle gives up a small part
of its energy to the electrons. After roughly 50 000 collisions
with atoms, the α-particle will have lost half of its energy.
It will not cease to exist, but its energy and speed will have
changed. After 100 000 collisions the α-particle comes to a
halt and ceases to collide with atoms and to knock out
electrons. The stopped particle removes two electrons from the
surrounding atoms and becomes a neutral helium atom at rest
(with the exception of thermal motion). When Rutherford found
that incidence of α-particles into a vessel with gas is
accompanied by helium buildup in the vessel, this fact
signified the end of a very important stage in radioactive
studies.

The number of collisions necessary to stop an α-particle is
such that it takes only several centimeters of air to do so.
Actually, different α-particles (having the same initial
energy) experience different numbers of collisions; they are
not necessarily equal exactly to 100 000. However, since
100 000 is a big number, over a given path length the
departures of the number of collisions of separate α-particles
from the mean (100 000) are but slight (of the order of 300,
which is about 0.3% of the total number of collisions).[12.3]
For this reason, α-particles of the same energy always lose all
their energy over roughly the same distance. This path length
depends on the initial energy of the α-particle. If a flux of
α-particles of the same energy is directed along the $x$ axis,
the relationship between the intensity of the flux and the path
length $x$ is shown in Figure 12.5.1. The curve is quite
different from the graph of the exponential function. Over a
considerable portion of the path length, the intensity of the
flux of particles does not change: the same number of
α-particles fly through an area of 1 cm$^2$ in the same
intervals of time. Then the intensity falls off sharply. This
drastic fall was prepared over the section where the intensity
remained constant, because over this portion the energy of the
α-particles diminished with increasing path length. The sharp
drop in the flux intensity occurs where the energy of the
α-particles becomes extremely small.

Figure 12.5.1

Figure 12.5.2

[12.3] Here the law of large numbers operates. This law deals
with a multiple repetition of random events of the same type,
the events being such that their outcome cannot be predicted.
This law states that in a large number of trials the fraction
of cases with a definite outcome differs very little from the
expected (mean) number of such outcomes (this fraction
necessarily tends to the mean number as the number $N$ of
repetitions of the event tends to infinity).
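
The quoted spread of about 300 collisions is what one gets from the usual square-root estimate for a sum of many independent random events. That the collisions are independent and that the spread is of the order of the square root of the mean are assumptions of this sketch, not statements from the text; the variable names are illustrative.

```python
import math

# Mean number of collisions needed to stop an alpha particle (from the text).
MEAN_COLLISIONS = 100_000

# Assumption: for many independent collisions the typical departure from
# the mean is of the order of the square root of the mean.
typical_spread = math.sqrt(MEAN_COLLISIONS)
relative_spread = typical_spread / MEAN_COLLISIONS

print(f"typical spread: about {typical_spread:.0f} collisions")
print(f"relative spread: about {100 * relative_spread:.2f}%")
```

Under this assumption the relative spread shrinks as one over the square root of the mean, which is why particles of equal energy stop at nearly the same distance.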

The picture is similar in the case of 
fast electrons (β-particles emitted when
a neutron is converted into a proton in 
the nucleus of an atom). Here the situ- 
ation is complicated by the fact that 
in radioactive decay electrons with 
different velocities are emitted; what 
is more, the electrons give up part of 
their energy to the atom near which 
they fly and also experience a consid- 
erable lateral deviation. 

The curve for the flux intensity of 
β-particles as a function of path length
is of the shape shown in Figure 12.5.2. 
We see that it differs greatly from the 
one in Figure 12.5.1. Even for small 
x's some of the electrons fall out of the 
beam. These are mostly electrons which 
had low initial velocities. Therefore, 
near x = 0 the behavior of the curve is 
similar to that of the exponential func- 
tion. Further on, however, the curve 
reaches the x axis, the intensity I be- 
comes zero for a very definite value of x 
corresponding to the maximum energy 
of the electrons generated in the given 
type of radioactive decay. 

The most important practical problems (they are also the most
difficult ones) are those connected with protection against
γ-rays (gamma rays) emitted by radioactive substances and
against neutrons produced in the fission of nuclei in nuclear
reactors and nuclear explosions. The situation here is confused
and complicated in the extreme by the fact that γ-rays and
neutrons give up energy in large portions and are strongly
deflected from their original directions in the process. Even
in a thick layer of air (100 to 200 meters) there is a
considerable probability (of the order of 37%) of the passage
of unaltered γ-rays and neutrons. This is why they require
thick shielding. A flux of γ-rays and neutrons does not become
zero for a definite thickness of the protective layer, as was
the case for α- and β-particles. As experiments and complicated
calculations show, for a given large thickness of the
protective layer, a flux of γ-rays and neutrons falls off in
rough accord with the exponential law, with the exponent often
differing from the one corresponding to the initial section of
the curve.

12.6* Absorption and Emission 
of Light by a Hot Gas 

Let us imagine a hot gas through which 
a beam of light is passing. We wish to 
study in greater detail the effect that 
light propagation has on the atoms of the 
gas, say, sodium atoms. More precisely, 
we wish to know what happens to 
the atoms when they absorb or emit 
light and when they collide with other 
atoms and molecules, bearing in mind 
that the temperature of the gas is high. 

First we must note that the assump- 
tion that an atom is like a small ball on 
a spring, or an oscillator (a device 
with a definite natural frequency), in 
many cases is not sufficiently accurate. 
An atom is a system consisting of a nucleus and electrons that
obeys the laws of quantum mechanics. These laws cannot be
discussed here; suffice it to say that a quantum mechanical
system of this type may be only in certain states with definite
(or, as the mathematician would say, discrete) energy values.
The lowest state, which is characterized by the lowest energy
possible for the system, is called the ground state (we will
sometimes call it the unexcited, or zeroth, state and label the
quantities referring to this state with a subscript "0": the
energy of the atom in this state is $E_0$, the number of atoms
in the ground state per unit volume is $n_0$, and so on). In
the same way we will speak of the first state (after the ground
state) as the first excited state and label the respective
quantities with a subscript "1": the energy $E_1$, the
concentration of atoms $n_1$, etc. The second excited state has
energy $E_2$, a concentration of atoms $n_2$, and so on. We
will always assume that

$E_0 < E_1 < E_2 < \ldots$

At a low temperature all the atoms in a system have the lowest
possible energy, that is, are in the ground state: the energy
of each atom is $E_0$, and the concentration $n_0$ of atoms is
equal to the concentration $n$ of the entire collection of
atoms (in all possible states).[12.4]

An atom in its ground state can 
absorb light, going over in the process 
to another state, say, the first excited. 
What does quantum mechanics say 
about this? In the language of quantum mechanics, light
consists of separate particles, called sometimes light quanta
and sometimes photons, and the energy of each such particle (or
quantum of light or photon) is $h\nu$, where
$h = 6.63 \times 10^{-34}$ J·s is Planck's constant, and $\nu$
is the frequency of the light (number of oscillations per
second).
A peculiar feature of quantum theory 
is that while speaking of particles of 
light we do not give up such notions as 
the frequency of the light and wave- 
length, which strictly speaking belong 
only to the concept of light as an electro- 
magnetic wave. 

But let us go no deeper into the 
philosophy of quantum mechanics. 

Energy conservation states that when an atom absorbs light (a
single photon) and goes over to its first excited state, then

$E_0 + h\nu = E_1$. (12.6.1)

The same holds when an excited atom emits a photon, going over
from the first excited state to the ground state. This means
that the "resonance" frequency $\nu$ of the absorbed or emitted
light depends on the difference in the energies of the two
states of the atom; this frequency (which can be measured using
a spectrometer) characterizes the transition from one state of
an atom to another. The quantity $\nu$, more precisely,
$\nu_{1,0}$, is not the frequency of electron vibrations inside
the atom in a definite state (zeroth or first excited); it
depends on both states. Spectroscopists (scientists dealing
with the theory and interpretation of interactions between
matter and radiation) noticed this pattern long ago, since it
is evident from tables of spectral lines in the spectra of
various atoms. Indeed, it was found that

$\nu_{4,1} + \nu_{1,0} = \nu_{4,3} + \nu_{3,0} = \ldots$

(Why? Verify this.) The validity of such equalities can be
verified without knowing anything about atoms.

[12.4] Strictly speaking, $n = n_0 + n_1 + n_2 + \ldots$, but
at a low temperature $n_0 \gg n_1$, $n_2$, etc., and
$n_1 + n_2 + \ldots \ll n_0$, so that $n \approx n_0$.
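
One way to carry out the suggested verification is sketched below. It assumes, as in (12.6.1), that each transition frequency is the energy difference of the two states divided by Planck's constant; the general notation $\nu_{i,j}$ for a transition between states $i$ and $j$ is introduced here only for this check.

```latex
\[
  \nu_{i,j} = \frac{E_i - E_j}{h}
  \quad\Longrightarrow\quad
  \nu_{4,1} + \nu_{1,0}
    = \frac{(E_4 - E_1) + (E_1 - E_0)}{h}
    = \frac{E_4 - E_0}{h},
\]
\[
  \nu_{4,3} + \nu_{3,0}
    = \frac{(E_4 - E_3) + (E_3 - E_0)}{h}
    = \frac{E_4 - E_0}{h}.
\]
```

Both sums equal the frequency of the direct transition from state 4 to state 0, so the energy of the intermediate state cancels.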

The famous Danish physicist Niels 
Bohr (1885-1962) was the first to under- 
stand the true meaning of, and reason 
for, such a pattern. He formulated the 
following “quantum postulates”: 

(1) there is a set of states (0, 1, 
2, . . .), called stationary states, in 
which the atom does not emit light; 

(2) emission of light occurs during 
the transition of the atom between two 
stationary states, and the frequency of 
the radiation emitted equals the differ- 
ence in the energy of the states divided 
by Planck’s constant. 

Similar statements can be formulated 
for light absorption by atoms. 

Let us now return to the main ques- 
tion of this section — how a hot gas ab- 
sorbs and emits light. The reader will 
recall that at a low temperature all the 
atoms in a gas are in their ground state 
and that light of resonance frequency (we will again write
$\nu$ instead of $\nu_{1,0}$) is absorbed, so that

$\frac{dI}{dx} = -\sigma n_0 I$, (12.6.2)

where $I$ is the light flux and $\sigma$ is the so-called cross
section (see formula (12.2.2)). What will change if the gas is
hot?

In a hot gas, as a result of the thermal motion of the atoms
and collisions of the atoms with atoms and molecules of other
gases (if we are dealing with a mixture of gases), a fraction
of the atoms will transfer to states with higher energies. In
other words, we will have excited atoms in addition to those in
the ground state, that is, we can no longer ignore $n_1$,
$n_2$, .... According to the general law of thermal motion (see
Section 11.4), at a given temperature $T$ we have

$n_0 : n_1 : n_2 : \ldots = e^{-E_0/kT} : e^{-E_1/kT} : e^{-E_2/kT} : \ldots$ (12.6.3)

If the total number of atoms per unit volume, $n$, is fixed,
then, combining (12.6.3) and the fact that
$n = n_0 + n_1 + n_2 + \ldots$, we get

$n_0 = \frac{n e^{-E_0/kT}}{e^{-E_0/kT} + e^{-E_1/kT} + \ldots} = \frac{n}{1 + e^{-(E_1 - E_0)/kT} + \ldots}$, (12.6.4)

$n_1 = \frac{n e^{-(E_1 - E_0)/kT}}{1 + e^{-(E_1 - E_0)/kT} + \ldots}$, ....

We see that for a fixed $n$ the absorption drops as the
temperature rises, since $n_0 < n$. But in addition to this
(and this is the most important aspect) there appear excited
atoms in the gas, atoms that are capable of emitting light.
The frequency of the light they emit is 
the same as that of the light they 
absorbed earlier. 
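
The populations (12.6.4) are easy to evaluate numerically. The sketch below is a hedged illustration: it keeps only two levels (a truncation made here for simplicity), ignores any statistical weights of the levels, takes the energy gap from the sodium line at 5890 Å mentioned earlier, and uses standard values of the Boltzmann constant and the speed of light that are not quoted in the text. The function name `populations` and the two temperatures are arbitrary choices.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K (standard value, not from the text)
H = 6.63e-34         # Planck's constant, J*s (value quoted in the text)
C = 2.998e8          # speed of light, m/s (standard value, not from the text)

def populations(n_total, energies, temperature):
    """Split n_total among levels in proportion to exp(-E_i / kT), as in (12.6.4)."""
    e0 = min(energies)  # measure energies from the lowest level to avoid underflow
    weights = [math.exp(-(e - e0) / (K_B * temperature)) for e in energies]
    norm = sum(weights)
    return [n_total * w / norm for w in weights]

# Assumed two-level model: gap equal to the photon energy of the 5890 A line.
gap = H * C / 5.89e-7
levels = [0.0, gap]

cold = populations(1.0, levels, 300.0)    # room temperature (illustrative)
hot = populations(1.0, levels, 3000.0)    # hot gas (illustrative)

print(f"excited fraction at  300 K: {cold[1]:.1e}")
print(f"excited fraction at 3000 K: {hot[1]:.1e}")
```

In this simplified model the excited fraction is utterly negligible at room temperature and becomes appreciable only in a hot gas, which is the situation the text goes on to discuss.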

It may be assumed that in the presence of excited atoms the
equation for the intensity of the light is

? $\frac{dI}{dx} = -\sigma n_0 I + \kappa n_1$ ? (12.6.5)

Here the coefficient $\kappa$ must have the meaning of the
product of the probability $w$ of an excited atom going over in
one second to the ground (zeroth) state and emitting a photon
with the probability $\alpha$ of the emitted photon joining the
light flux. Since $I$ is expressed in units of J/(m$^2$·s), we
must multiply $w\alpha$ by the energy $h\nu$ of one photon. The
reasoning behind all this is as follows: in a layer $dx$ meters
thick with a cross-sectional area of one square meter there are
$n_1\,dx$ excited atoms, with the rest of the reasoning done on
dimensional grounds, namely, $w$ has the dimensions of
1/s = s$^{-1}$, $\alpha$ is dimensionless, and $n_1$ has the
dimensions of m$^{-3}$, so that if $\kappa = w\alpha h\nu$, the
dimensions of $\kappa n_1$ are J/(m$^3$·s), which is the same
as the dimensions of $dI/dx$. We see that on dimensional
grounds formula (12.6.5) is correct. And yet it is not correct!

The question marks on both sides of Eq. (12.6.5) are to remind
the reader that this formula is incorrect. We will see how this
formula must be modified when we consider the thermodynamic
equilibrium of light fluxes.

12.7* Radiation in Thermodynamic 
Equilibrium 

Suppose that the vessel containing 
the gas has ideally reflecting walls. 
The temperature of the gas is fixed and 
is maintained constant. 

It is natural to assume that if ini- 
tially there was no light in the vessel, 
it will appear as a result of emission 
by excited atoms, while if initially the 
intensity of the light was too high, the 
surplus of light will be absorbed by the 
atoms in the ground state. 

It can be expected that, regardless of 
the concrete initial state, a certain 
intensity of light will set in in the ves- 
sel as a result of absorption and emis- 
sion of light by the gas with a fixed tem- 
perature. More refined reasoning shows 
that the intensity of light of a given 
frequency depends only on the tem- 
perature of the gas. Indeed, imagine two 
vessels filled with two different gases 
but having the same temperature. Next 
connect these vessels with a light guide 
containing an optical filter that trans- 
mits light only of a definite frequency 
in both directions. If the light intensi- 
ties were different, then, in spite of 
having the same temperature, one ves- 
sel would give off more energy than it 
receives, while the other would receive
more energy than it gives off, so that the
first vessel would cool off while the
other would heat up, which is clearly
impossible.

Thus, we conclude that there is 
a certain equilibrium intensity of light, 
an intensity at which there is equilib- 
rium between emission and absorption. 
This equilibrium intensity at a given 
frequency depends only on the 
temperature of the gas and not on the 
properties of the gas. It is natural then 
to try to find the equilibrium intensity 
of light starting directly from the gen- 
eral principles of thermal motion. 

Let us consider radiation (or light) of 
a certain frequency and direction in a 
vessel. The energy of this radiation can- 
not assume arbitrary values: according 
to quantum theory (in our case, the theory of photons), the
energy may be zero (not a single photon of a given wavelength)
or $h\nu$ (one photon) or $2h\nu$ (two photons), and so on.

From general principles, the ratios of the probabilities $p_0$,
$p_1$, $p_2$, ... of the corresponding states are

$p_0 : p_1 : p_2 : \ldots = 1 : e^{-h\nu/kT} : e^{-2h\nu/kT} : \ldots$ (12.7.1)

in accordance with the fact that the energies are 0, $h\nu$,
$2h\nu$, ... and the probabilities are proportional to
$e^{-E/kT}$ (since $e^0 = 1$, the sequence on the right-hand
side of (12.7.1) starts with unity). But the sum of all these
probabilities must be equal to unity because we have exhausted
all the possibilities: the given light ray may be characterized
by one, two, three, etc., photons or by the absence of photons.
This means that we can normalize, so to say, the distribution
(12.7.1) by adding to these ratios of probabilities the
following:

$p_0 + p_1 + p_2 + \ldots = 1$, (12.7.2)

which shows that the sum of the probabilities is unity. From
(12.7.1) it clearly follows that

$p_n = a e^{-nh\nu/kT}$, $n = 0, 1, 2, \ldots$ (12.7.3)

(i.e. $p_0 = a$, $p_1 = a e^{-h\nu/kT}$, etc.), where the
factor $a$ has yet to be found. To find $a$, we employ (12.7.2):

$p_0 + p_1 + p_2 + \ldots = a(1 + e^{-h\nu/kT} + e^{-2h\nu/kT} + \ldots) = \frac{a}{1 - e^{-h\nu/kT}} = 1$ (12.7.4)

and, hence,

$a = 1 - e^{-h\nu/kT}$ (12.7.4a)

(note that inside the parentheses in (12.7.4) we have the sum
of the terms of a geometric progression).

Now we can find the average number of photons $\varphi$ and the
average energy (more precisely, the energy density)
$\varepsilon$ in a given type of rays:

$\varphi = 1 \times p_1 + 2 \times p_2 + \ldots$, (12.7.5)

$\varepsilon = \varphi h\nu$ (12.7.5a)

(since $\varphi$ is the average number of photons per unit
volume, the dimensions of this quantity are m$^{-3}$, while the
dimensions of $\varepsilon$ are J/m$^3$). Substituting (12.7.3)
and (12.7.4a) into (12.7.5) and then the result into (12.7.5a),
we get

$\varphi = \frac{e^{-h\nu/kT}}{1 - e^{-h\nu/kT}} = \frac{1}{e^{h\nu/kT} - 1}$, (12.7.6)

$\varepsilon = \frac{h\nu}{e^{h\nu/kT} - 1}$ (12.7.6a)

(here we have left out the simple calculations and have given
the final result[12.5]). A given frequency $\nu$ corresponds to
a certain period $1/\nu$ of the oscillations and a certain
wavelength $\lambda = c/\nu$ of the wave.

[12.5] To derive (12.7.6) from (12.7.3), (12.7.4a), and
(12.7.5), it is sufficient to note that
$\varphi - e^{-h\nu/kT}\varphi = p_1 + p_2 + p_3 + \ldots = 1 - p_0$
$(= 1 - a)$. (Why?)

We can now find the number of waves in a volume $V$ with
frequencies ranging from $\nu$ to $\nu + d\nu$. One must bear
in mind that electromagnetic waves have a certain polarization:
the electric field in a wave is at right angles to the
direction of propagation of the wave (in three-dimensional
space, however, there are numerous mutually perpendicular
directions that are at right angles to a given direction). We
must also allow for the conditions of wave reflection from the
walls of the vessel, but it has been found that the number of
independent types of oscillations with frequencies within a
given range depends, to the first approximation, only on the
volume $V$ and is proportional to this volume. Thus, we can
speak of a certain number $dK$ of oscillations within a
frequency range $d\nu$ per unit volume; this number proves to
be

$dK = \frac{8\pi\nu^2\,d\nu}{c^3}$, (12.7.7)

where $c$ is the speed of light. Since the frequency $\nu$
(and, hence, $d\nu$) has the dimensions of s$^{-1}$ and $c$ the
dimensions of m/s, the dimensions of $dK$ are m$^{-3}$.

A full derivation of (12.7.7) is too tedious to include in a book for beginners, but it is easy to see that (12.7.7) is reasonable and can easily be understood. Note that $\nu/c = 1/\lambda$, whence $\nu = c/\lambda$ and $|d\nu| = (c/\lambda^2)\,d\lambda$. Then, according to (12.7.7), the number of oscillations with wavelengths ranging from $\lambda$ to $\lambda + d\lambda$ is

$$dK = \frac{8\pi}{\lambda^4}\,d\lambda. \qquad (12.7.7a)$$

This means that the number of oscillations with wavelengths ranging, say, from $\lambda_0$ to $2\lambda_0$ is

$$K = \int_{\lambda_0}^{2\lambda_0} \frac{8\pi\,d\lambda}{\lambda^4} = \frac{8\pi}{3}\left(\frac{1}{\lambda_0^3} - \frac{1}{(2\lambda_0)^3}\right) = \frac{7\pi}{3}\,\frac{1}{\lambda_0^3}. \qquad (12.7.8)$$

If we disregard the numerical factor, which differs but little, in order of magnitude, from unity, we can say that the number of oscillations with wavelengths close to $\lambda_0$ in a unit volume is approximately equal to the number of cubes with an edge $\lambda_0$ that fit inside this volume. On the average, each oscillation occupies a volume of $\lambda_0^3$, which constitutes a result that not only can easily be remembered but is also quite natural.
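The estimate (12.7.8) can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates the density $8\pi/\lambda^4$ of (12.7.7a) from $\lambda_0$ to $2\lambda_0$ by the midpoint rule and compares the result with the closed form $7\pi/(3\lambda_0^3)$. The function names and the value of `lam0` are illustrative assumptions.

```python
import math

def mode_density_per_wavelength(lam):
    """Oscillations per unit volume per unit wavelength, 8*pi/lam**4, as in (12.7.7a)."""
    return 8.0 * math.pi / lam**4

def count_modes(lam_lo, lam_hi, steps=100_000):
    """Midpoint-rule integral of the mode density between two wavelengths."""
    width = (lam_hi - lam_lo) / steps
    total = 0.0
    for i in range(steps):
        lam = lam_lo + (i + 0.5) * width
        total += mode_density_per_wavelength(lam) * width
    return total

lam0 = 5.0e-7  # metres; an assumed illustrative wavelength (roughly green light)
numeric = count_modes(lam0, 2.0 * lam0)
closed_form = 7.0 * math.pi / (3.0 * lam0**3)  # right-hand side of (12.7.8)
modes_per_cube = closed_form * lam0**3         # oscillations per cube of edge lam0

print(numeric, closed_form, modes_per_cube)
```

The last number is $7\pi/3 \approx 7.3$ whatever the choice of `lam0`, which is the "numerical factor" the text sets aside when it assigns a volume of about $\lambda_0^3$ to each oscillation.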

Now we can easily write an expression for the total energy of radiation per unit volume, $Q$. To this end we take the energy of each oscillation with a definite frequency, multiply it by the number of oscillations of the given type, and add the products (that is, integrate them):

$$Q = \int_0^\infty \frac{h\nu}{e^{h\nu/kT} - 1}\,\frac{8\pi\nu^2\,d\nu}{c^3} \qquad (12.7.9)$$

($Q$ has the dimensions of energy density, J/m$^3$). It is convenient to introduce the notation $h\nu/kT = x$ and isolate the dimensionless integral thus:

$$Q = \frac{8\pi h}{c^3}\left(\frac{kT}{h}\right)^4 \int_0^\infty \frac{x^3\,dx}{e^x - 1}. \qquad (12.7.9a)$$


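The integral on the right-hand side of (12.7.9a) cannot be expressed in terms of elementary functions. Approximately, the integral

$$J = \int_0^\infty \frac{x^3\,dx}{e^x - 1} \qquad (12.7.10)$$

can be evaluated if we note that

$$\frac{1}{e^x - 1} = e^{-x}\,\frac{1}{1 - e^{-x}} = e^{-x}(1 + e^{-x} + e^{-2x} + \ldots) = e^{-x} + e^{-2x} + \ldots,$$

and, hence,

$$J = \int_0^\infty x^3 e^{-x}\,dx + \int_0^\infty x^3 e^{-2x}\,dx + \ldots. \qquad (12.7.11)$$

But (see (7.6.19))

$$\int_0^\infty x^3 e^{-x}\,dx = 3! = 6, \qquad \int_0^\infty x^3 e^{-2x}\,dx = \frac{1}{16}\int_0^\infty y^3 e^{-y}\,dy = \frac{1}{16} \times 6, \quad \text{etc.},$$

so that

$$J = \int_0^\infty \frac{x^3\,dx}{e^x - 1} = 6\left(1 + \frac{1}{16} + \frac{1}{81} + \ldots\right). \qquad (12.7.11a)$$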
The series on the right-hand side of (12.7.11a) is convergent; if we keep only five terms, we get 6.482 for the approximate value of integral (12.7.10).

The theory of functions of a complex variable (see Section 17.2) yields the exact value of (12.7.10), which is $\pi^4/15$ or approximately 6.494; thus, the error introduced by limiting the sum in (12.7.11a) to five terms is smaller than 0.2%.
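The two numbers quoted above are easy to reproduce. The sketch below is an illustration added here, with function names of our own choosing: it sums the first five terms of the series $6(1 + 1/2^4 + 1/3^4 + \ldots)$ and compares the result with $\pi^4/15$.

```python
import math

def partial_sum(terms):
    """6 * (1 + 1/2**4 + ... + 1/terms**4): the series (12.7.11a) cut off after `terms` terms."""
    return 6.0 * sum(1.0 / n**4 for n in range(1, terms + 1))

exact = math.pi**4 / 15.0          # exact value of the integral (12.7.10)
five_terms = partial_sum(5)        # the approximation used in the text
relative_error = (exact - five_terms) / exact

print(f"{five_terms:.3f} {exact:.3f} {relative_error:.2%}")
# prints: 6.482 6.494 0.18%
```

Since every discarded term is positive, the truncated sum always lies below the exact value, and each additional term reduces the error.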

Thus, the exact expression for the equilibrium radiation density (in J/m$^3$) at a given temperature $T$ (in kelvins) is

$$Q = \frac{8\pi^5 k^4}{15 c^3 h^3}\,T^4 \approx 7.57 \times 10^{-16}\,T^4. \qquad (12.7.12)$$

Suppose that the vessel in which such a radiation density has set in has a hole in one of its walls with an area of $S$ square meters through which energy flows out. The energy flux will be equal to $RS$, where $R$ is measured in units of J/m$^2\cdot$s, with

$$R = \frac{c}{4}\,Q \approx 5.67 \times 10^{-8}\,T^4. \qquad (12.7.13)$$

Electromagnetic energy “flows,” or propagates, with the speed of light $c$, but at each given moment of time only a half of the photons are moving toward the hole, with the other half moving in the opposite direction. In addition to this, even the photons that do move toward the hole are flying at different angles to the wall with the hole, which diminishes the flux by a factor of two. Exact allowance for these factors gives the coefficient 1/4 in (12.7.13).
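The numerical coefficients in (12.7.12) and (12.7.13) follow from the expressions $8\pi^5 k^4/(15 c^3 h^3)$ and $c/4$ times that quantity. The following sketch evaluates both; the values of $h$, $k$, and $c$ are the current SI values supplied here as assumptions (the text itself quotes only rounded numbers), and the function names are ours.

```python
import math

H = 6.62607015e-34    # Planck's constant, J*s (SI value, assumed here)
K_B = 1.380649e-23    # Boltzmann's constant, J/K (SI value, assumed here)
C = 2.99792458e8      # speed of light, m/s

def energy_density_coefficient():
    """Coefficient of T**4 in (12.7.12), in J/(m^3 K^4)."""
    return 8.0 * math.pi**5 * K_B**4 / (15.0 * C**3 * H**3)

def flux_coefficient():
    """Coefficient of T**4 in (12.7.13): c/4 times the density coefficient, in J/(m^2 s K^4)."""
    return C / 4.0 * energy_density_coefficient()

def radiated_flux(temperature):
    """Energy leaving unit area of the hole per unit time at the given temperature in kelvins."""
    return flux_coefficient() * temperature**4

print(energy_density_coefficient(), flux_coefficient(), radiated_flux(300.0))
```

The first two printed values come out close to $7.57 \times 10^{-16}$ and $5.67 \times 10^{-8}$, the figures used in the text.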

Now let us close the hole in our box (vessel with mirror walls) with a black material heated to the same temperature as that of the box and the radiation inside it. When we speak of the material as black, this means that any photon with any frequency falling on the material is absorbed. Our “stopper” should heat up, since it absorbs radiation. However, since the stopper has the same temperature as the radiation, it obviously cannot heat up to a higher temperature. Hence, the stopper will emit as much radiation as it absorbs: we conclude, therefore, that a black body with a temperature $T$ emits per unit area per unit time $5.67 \times 10^{-8}\,T^4$ joules of energy ($T$ in kelvins). This quantity characterizes what is called blackbody radiation.

The order in which we have discussed all these questions is opposite to the order in which these concepts developed historically. First the law (12.7.13) was discovered in experiments, that is, it was established that the total energy radiated by a black body is proportional to the fourth power of the temperature of the body (1879; the Stefan-Boltzmann law or the fourth-power law or Stefan's law of radiation), with the proportionality coefficient equal to $5.67 \times 10^{-8}$ J/m$^2\cdot$s$\cdot$K$^4$ (the Stefan-Boltzmann constant).

In 1900 formula (12.7.6a), known as the Planck radiation formula, was discovered. Planck's constant $h = 6.63 \times 10^{-34}$ J$\cdot$s appeared for the first time in this formula.

Planck assumed that energy associat- 
ed with electromagnetic radiation is 
emitted and absorbed in discrete amounts 
proportional to the frequency of the 
radiation, with h being the coefficient 
of proportionality, but that quantum 
theory has nothing to do with the 
propagation of the radiation. Later, 
in 1905, Einstein established that light 
consists of certain portions with the 
energy of each portion equaling hv and 
momentum equaling hv/c. Finally, in 
1924, the Indian physicist Satyendra 
Nath Bose (1894-1974) gave a derivation 
of the Planck radiation formula based 
on the calculation of the probabilities $p_0, p_1, p_2, \ldots$, with which we started our discussion.


12.8* Emission Probability and the 
Conditions for Thermodynamic 
Equilibrium 

Let us return to the formula for the 
intensity of a beam of light passing 
through a hot gas. 
From Eq. (12.6.5) (the reader will recall that this equation is incorrect),

$$\frac{dI}{dx} = -\sigma n_0 I + \kappa n_1,$$

and the condition $dI/dx = 0$ we obtain the equilibrium intensity $I_{eq}$:

$$I_{eq} = \frac{\kappa n_1}{\sigma n_0}. \qquad (12.8.1)$$

The ratio $n_1/n_0$ of the concentration of excited atoms, $n_1$, to the concentration of atoms in the ground state, $n_0$, is equal to $\exp[-(E_1 - E_0)/kT]$ (see (12.6.3) or (12.6.4)). According to the quantum condition (12.6.1), $E_1 - E_0 = h\nu$, which means that $n_1/n_0 = e^{-h\nu/kT}$, so that

$$I_{eq} = \frac{\kappa}{\sigma}\,e^{-h\nu/kT}. \qquad (12.8.1a)$$

Formula (12.8.1a) for $I_{eq}$ is similar to the correct formula (12.7.6a) obtained by Planck, but does not coincide with it.

We will start with the similar aspects. The equilibrium radiation density $I_{eq}$ does not depend on the concentration of the gas whose atoms absorb (in the ground state) and emit (in an excited state) radiation; it depends only on the frequency $\nu$ of the light and the temperature $T$ of the gas. In the limit $h\nu/kT \gg 1$, the dependence of $I_{eq}$ on these two quantities is the same as in the Planck radiation formula. Indeed, if $h\nu/kT \gg 1$, then $e^{h\nu/kT} \gg 1$ and in the Planck radiation formula (12.7.6a),

$$\varepsilon = \frac{h\nu}{e^{h\nu/kT} - 1},$$

we can ignore the one in the denominator in comparison to the large term $e^{h\nu/kT}$: for $h\nu/kT \gg 1$ we have

$$\varepsilon = \frac{h\nu}{e^{h\nu/kT} - 1} \approx h\nu\,e^{-h\nu/kT},$$

from which it follows that

$$I_{eq} \approx c h\nu\,e^{-h\nu/kT}, \quad h\nu/kT \gg 1, \qquad (12.8.2)$$

since $I$ is the energy flux, and $c$ is the speed of light. (Note that since the energy density $\varepsilon$ has the dimensions of J/m$^3$ and $c$ has the dimensions of m/s, the product $c\varepsilon$ has the dimensions of J/m$^2\cdot$s, which are the dimensions of $I$.)

Thus, the formula (12.8.1a), which was obtained via naive assumptions, expresses a dependence of $I_{eq}$ on $\nu$ and $T$ that does not differ significantly from (12.8.2). For the similarity to be complete, we must require that the ratio of the energy given off by an excited atom to a single wave in the form of radiation (this energy is characterized by coefficient $\kappa$) to the cross section of absorption of light by the atom in the ground state, $\sigma$, obeys the following expression:

$$\frac{\kappa}{\sigma} = c h\nu. \qquad (12.8.3)$$

From the point of view of the theory of resonance (and from the point of view of quantum mechanics, as well), this formula looks quite natural. If the absorption cross section is small (e.g., because the charge on the vibrating ball is small), so is the probability of emission (which also depends on the charge on the ball).

Now we wish to go from the probability of emission of a single wave to the total probability of energy emission by an excited atom. If we divide the energy emitted per unit time into a single wave by the energy $h\nu$ of one photon, we obtain the probability of emission into a single wave. The next step is to add all these probabilities, that is, to integrate over the frequency range:$^{12.6}$

$$w = \int \frac{\kappa}{h\nu}\,\frac{8\pi\nu^2\,d\nu}{c^3} = \int \frac{\sigma\,8\pi\nu^2\,d\nu}{c^2} \qquad (12.8.4)$$

(the dimensions of $w$ are 1/s = s$^{-1}$). Since $w$ is the emission probability (precisely, the probability of emission per unit time, or emission rate), we have the following equation for the concentration of excited atoms in the absence of external radiation:

$$\frac{dn_1}{dt} = -w n_1. \qquad (12.8.5)$$

The solution to this equation is obvious:

$$n_1(t) = n_1(t_0)\exp[-w(t - t_0)]. \qquad (12.8.6)$$

Accordingly, $\tau = w^{-1}$ is the mean lifetime of the excited atom (cf. Section 8.2).

12.6 It is unimportant that this range includes frequencies which the given excited atom will not emit, since for such frequencies $\kappa = 0$ and the factor $\kappa(\nu)$ in the integrand will cut off such frequencies.
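The statement that $\tau = w^{-1}$ is the mean lifetime can be verified directly from the decay law $n_1(t) = n_1(t_0)\exp[-w(t - t_0)]$: the fraction of atoms decaying between $t$ and $t + dt$ is $w e^{-wt}\,dt$, and averaging $t$ over this distribution gives $1/w$. The sketch below does the averaging numerically; the emission rate of $10^8$ s$^{-1}$ is an assumed, merely illustrative value, and the function names are ours.

```python
import math

def excited_population(n_start, w, elapsed):
    """Concentration of excited atoms after `elapsed` seconds, as in (12.8.6)."""
    return n_start * math.exp(-w * elapsed)

def mean_lifetime(w, steps=200_000, horizon_in_lifetimes=40.0):
    """Average of t over the decay-time distribution w*exp(-w*t), by the midpoint rule."""
    t_max = horizon_in_lifetimes / w
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += t * w * math.exp(-w * t) * dt
    return total

w = 1.0e8  # emission rate in 1/s; an assumed illustrative value
print(mean_lifetime(w), 1.0 / w)
print(excited_population(1.0, w, 1.0 / w))  # fraction left after one lifetime, exp(-1)
```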

The reader will note that above only the first excited state was considered, a state that can decay in only one manner, that is, go over to the ground (or unexcited) state. As a rule, for a given ground state and a fixed excited state (with a given spatial orientation of the atom), the absorption cross section $\sigma$ depends only on the direction of propagation and the polarization of the incident wave. The exact formula in this case assumes the form

$$w = \iint (\sigma_1 + \sigma_2)\,\frac{\nu^2\,d\nu}{c^2}\,d\Omega, \qquad (12.8.7)$$

where $\sigma_1$ and $\sigma_2$ refer to the two polarizations of the wave, and $d\Omega$ is the solid-angle element. When $\sigma_1 = \sigma_2 = \sigma$, that is, the cross sections are independent of direction, we have $\sigma_1 + \sigma_2 = 2\sigma$ and $\int d\Omega = 4\pi$, hence the factor $8\pi$ in the simplified formula (12.8.4).

Note the dimensions of the various quantities: $\sigma$ has the dimensions of m$^2$ and $c/\nu = \lambda$ the dimensions of wavelength, m, so that the quantity $\sigma/\lambda^2$ in the integral

$$w = 8\pi \int \frac{\sigma}{\lambda^2}\,d\nu \qquad (12.8.8)$$

is dimensionless. What is important is that $\sigma$ depends strongly on the frequency: light absorption is a resonance process. The reader will recall the results obtained in the theory of oscillations: the absorption of energy by a damped oscillator depends on frequency via the following formula:

$$A \propto \frac{1}{(\nu - \nu_0)^2 + \gamma^2}. \qquad (12.8.9)$$

It has been found that in the case of an atom at rest the absorption of light obeys a similar formula:$^{12.7}$

$$\sigma = \frac{\lambda^2}{2\pi}\,\frac{\gamma^2}{(\nu - \nu_0)^2 + \gamma^2}. \qquad (12.8.9a)$$


We see that the maximum cross section of absorption of light is of the order of the square of the light’s wavelength. For visible light this is many times the size of the atom: the yellow line of sodium has a wavelength of $\lambda \approx 5900 \times 10^{-10}$ m, while the radius of an electron orbit is of the order of $10^{-10}$ m, which means that the resonance cross section of absorption of a photon is greater than the area limited by the electron orbit by a factor of several millions. Here a photon manifests itself not as a point-like particle flying along a straight line and either hitting the target or missing it. According to the laws of quantum mechanics, the movement of a photon is determined by the wave equations of the electromagnetic field, and in this lies the possibility for an atom to capture a photon over a distance of the order of the light’s wavelength.
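The "factor of several millions" can be checked by simple arithmetic. In the sketch below the peak cross section is taken as $\lambda^2/2\pi$ (formula (12.8.9a) at $\nu = \nu_0$) and the area limited by the electron orbit as $\pi r^2$; the wavelength of about $5.9 \times 10^{-7}$ m for the yellow sodium line and the radius $10^{-10}$ m are round values assumed for the illustration.

```python
import math

wavelength = 5.9e-7      # m, yellow line of sodium (approximate, assumed value)
orbit_radius = 1.0e-10   # m, order of magnitude of an electron orbit

# Resonance cross section at the line center, from (12.8.9a) with nu = nu_0
peak_cross_section = wavelength**2 / (2.0 * math.pi)

# Area bounded by the electron orbit
orbit_area = math.pi * orbit_radius**2

ratio = peak_cross_section / orbit_area
print(f"cross section {peak_cross_section:.2e} m^2, orbit area {orbit_area:.2e} m^2, ratio {ratio:.1e}")
```

With these inputs the ratio is between one and two million, in line with the order-of-magnitude statement in the text.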

Up till now we emphasized the similarity between the approximate formula (12.8.1a) for the equilibrium radiation density and the exact Planck radiation formula (12.8.2) (note that $h\nu \gg kT$). But in what respects does the exact formula differ from the approximate one; in other words, how should we modify the expression for $dI/dx$ so that the equilibrium radiation density corresponds to the Planck radiation formula? The answer to this question was given by Einstein in 1917. As often happens in physics, the answer proved to be easier than formulating the question correctly.


12.7 Substituting (12.8.9a) into the integral on the right-hand side of (12.8.8), we get
$w =$
Einstein’s answer was as follows. We replace Eq. (12.6.5),

$$\frac{dI}{dx} = -\sigma n_0 I + \kappa n_1,$$

with the following:

$$\frac{dI}{dx} = -\sigma n_0 I + \kappa n_1 (1 + \varphi), \qquad (12.8.10)$$

where according to what was established earlier $\kappa = c h\nu\sigma$. If $\varphi$ is the average number of photons in a given ray (see Section 12.7), then

$$I = c h\nu\varphi, \qquad \frac{dI}{dx} = c h\nu\,\frac{d\varphi}{dx}. \qquad (12.8.11)$$

Substituting this into Eq. (12.8.10) yields

$$c h\nu\,\frac{d\varphi}{dx} = c h\nu\sigma\,[-n_0\varphi + n_1(1 + \varphi)]. \qquad (12.8.12)$$

First let us see whether we achieved our goal. Indeed, the condition $I = I_{eq}$, that is, $dI/dx = 0$, leads to the following formula for the equilibrium number of photons:

$$-n_0\varphi_{eq} + n_1(1 + \varphi_{eq}) = 0,$$

that is,

$$\frac{\varphi_{eq}}{1 + \varphi_{eq}} = \frac{n_1}{n_0} = e^{-h\nu/kT}, \qquad \varphi_{eq} = \frac{e^{-h\nu/kT}}{1 - e^{-h\nu/kT}} = \frac{1}{e^{h\nu/kT} - 1}. \qquad (12.8.13)$$

We have thus arrived at an expression 
for the probability of absorption and 
emission of light which leads to the 
Planck formula. Now let us stop and 
evaluate the price we paid for this 
result. 
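Before weighing that price, it may help to see numerically what the extra term buys. The sketch below, an illustration with names of our own choosing, writes $x = h\nu/kT$, sets $n_0 = 1$ and $n_1 = e^{-x}$, and compares the equilibrium photon number $1/(e^x - 1)$ that follows from the balance $-n_0\varphi + n_1(1 + \varphi) = 0$ with the value $e^{-x}$ implied by the naive balance $-n_0\varphi + n_1 = 0$ (formula (12.8.1a) with $\kappa/\sigma = c h\nu$ and $I = c h\nu\varphi$).

```python
import math

def planck_occupation(x):
    """Equilibrium photon number 1/(exp(x) - 1), with x = h*nu/(k*T)."""
    return 1.0 / math.expm1(x)

def naive_occupation(x):
    """Photon number from the balance without stimulated emission: exp(-x)."""
    return math.exp(-x)

def balance_residual(phi, x):
    """-n0*phi + n1*(1 + phi) with n0 = 1, n1 = exp(-x); vanishes at equilibrium."""
    return -phi + math.exp(-x) * (1.0 + phi)

for x in (0.1, 1.0, 5.0, 10.0):
    phi = planck_occupation(x)
    print(f"x={x:5.1f}  Planck={phi:.5g}  naive={naive_occupation(x):.5g}  "
          f"residual={balance_residual(phi, x):.1e}")
```

For large $x$ the two photon numbers practically coincide, as the comparison with (12.8.2) requires, while for small $x$ the Planck value is much larger than the naive one: this is the range where stimulated emission dominates.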

In formula (12.8.10) we added to the term $\kappa n_1$ another term, $\kappa n_1\varphi$, equal to $\sigma n_1 I$. Thus we (following Einstein)
maintain that there exists a process of 
photon emission induced by the pre- 
sence of the light beam and proportional 
to the number of photons, or radiation 
intensity, in the already existing beam. 
This process is known as stimulated 
emission (the word “stimulated” refers 
to the fact that the emission is caused 
by the presence of a light beam in the 
hot gas). 

Stimulated emission appears alongside spontaneous emission; the two terms in the parentheses in (12.8.10) correspond to the two types of emission, $\kappa n_1$ and $\sigma n_1 I$. Accordingly, the equation for the variation of the number of excited atoms $n_1$ is

$$\frac{dn_1}{dt} = -w n_1 - \frac{\sigma n_1 I}{h\nu} + \frac{\sigma n_0 I}{h\nu}. \qquad (12.8.14)$$


In the absence of radiation we have only one term, $-w n_1$, as we wrote earlier. However, in the presence of light, an atom can not only “jump” from the ground state to an excited (or upper) state (this fact is represented by the term $\sigma n_0 I/h\nu$ on the right-hand side of (12.8.14)) but can emit stimulated radiation by going over from an excited state to the (lower) ground state (the term $-\sigma n_1 I/h\nu$ in (12.8.14)). We also note that the cross section $\sigma$ in the terms describing absorption ($\sigma n_0 I/h\nu$) and stimulated emission ($-\sigma n_1 I/h\nu$) is the same. If the intensity $I$ of the beam is high and we can neglect the term $-w n_1$ describing spontaneous emission in comparison to the terms responsible for absorption and stimulated emission, the equation for $I$ simplifies:

$$\frac{dI}{dx} = -\sigma (n_0 - n_1)\,I. \qquad (12.8.15)$$

Obviously, the condition for this is the “strong” inequality $I \gg I_{eq}$, where $I_{eq}$ is the equilibrium radiation intensity at a given temperature of the gas.

Let us substitute into (12.8.15) the equilibrium concentration of excited atoms, $n_1 = n_0 e^{-h\nu/kT}$. The result is a formula for the effective absorption cross section:

$$\frac{dI}{dx} = -\sigma\,(1 - e^{-h\nu/kT})\,n_0 I. \qquad (12.8.16)$$

When the gas is heated, the effective cross section, that is, the coefficient of $I$, decreases not only due to the decrease in $n_0$ but also due to the countereffect produced by the stimulated emission, which yields an additional factor inside the parentheses on the right-hand side of (12.8.16). If $h\nu \ll kT$, then $e^{-h\nu/kT} \approx 1 - h\nu/kT$ and

$$1 - e^{-h\nu/kT} \approx \frac{h\nu}{kT} \ll 1,$$

and the decrease in absorption is considerable. If there are only two levels, 0 and 1, so that the total number of atoms is $n = n_0 + n_1$, then $n_0$ drops by a factor not greater than two as the temperature $T$ is raised from 0 to $\infty$.$^{12.8}$

For example, in radio astronomy a marked effect is produced by the transition of hydrogen atoms from a low-lying excited state to the ground state accompanied by emission of radio waves with $\lambda = 21$ cm (the hydrogen line). The energy of the respective photons is $h\nu = hc/\lambda \approx 6.6 \times 10^{-34} \times 3 \times 10^8/0.21 \approx 10^{-24}$ J. If the temperature of the hydrogen cloud is 100 K, then $kT \approx 1.38 \times 10^{-23} \times 100 \approx 10^{-21}$ J, and the value of the factor inside the parentheses in (12.8.16) is $(1 - e^{-h\nu/kT}) \approx h\nu/kT \approx 10^{-3}$ (cf. Section 4.8).
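The estimate for the 21-cm line can be reproduced in a few lines. The sketch below evaluates the factor $1 - e^{-h\nu/kT}$ for $\lambda = 0.21$ m and $T = 100$ K; the rounded constants and the function names are assumptions of this illustration.

```python
import math

H = 6.626e-34    # Planck's constant, J*s
K_B = 1.381e-23  # Boltzmann's constant, J/K
C = 2.998e8      # speed of light, m/s

def photon_energy(wavelength):
    """h*nu = h*c/lambda, in joules."""
    return H * C / wavelength

def absorption_reduction_factor(wavelength, temperature):
    """The factor 1 - exp(-h*nu/(k*T)) from (12.8.16); expm1 keeps precision for small arguments."""
    x = photon_energy(wavelength) / (K_B * temperature)
    return -math.expm1(-x)

energy = photon_energy(0.21)
thermal = K_B * 100.0
factor = absorption_reduction_factor(0.21, 100.0)
print(f"h*nu = {energy:.2e} J, kT = {thermal:.2e} J, factor = {factor:.2e}")
```

The factor comes out slightly below $10^{-3}$ and agrees with the linear approximation $h\nu/kT$ to better than a tenth of a percent, so stimulated emission cancels all but about one part in a thousand of the absorption in such a cloud.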

What are the qualitative features 
of the radiation produced in stimulated 
emission? The excited atom in this 
process emits radiation that amplifies 
the forward wave (the wave that stim- 
ulates the atom, so to say), since 
otherwise there would be no mechanism 
for compensation of the energy absorbed 
by the atoms in the ground (unexcited) 
state. In this respect, stimulated emis- 
sion differs from spontaneous emission, 
in which the excited atom emits radi- 
ation in all directions and only partially 
in the direction of the forward wave. 

Using the language of classical me- 
chanics, we could say that radiation 
absorption is similar to resonance build- 
up of oscillations (cf. Section 10.5), 
while stimulated emission is analogous 
to damping of an oscillator, since to 


12.8 Of course, it would be more correct to say that the temperature is raised from 0 to $T \gg (E_1 - E_0)/k$, since from the standpoint of physics the passage to the limit $T \to \infty$ is meaningless: the atoms would simply decay.


damp oscillations we need a force 
that is in resonance with the oscilla- 
tions, that is, has the same frequency. 
The difference between a driving force 
and a damping force lies in the different 
ratios of the phase of the force to the 
phase of the oscillator. 

However, there is no classical picture 
that can completely explain the phe- 
nomenon of stimulated emission. An 
exact and complete theory of stimulated 
emission is possible only if we employ 
quantum mechanics and the quantum 
theory of electromagnetic radiation 
(quantum field theory). The theory of 
stimulated emission enters as an inte- 
gral part into the quantum theory of 
radiation, whose bases were developed 
by P. Dirac in 1927. 

This fact stresses even more the re- 
markable intuition of Einstein who in 
1917 was able to clarify the main fea- 
tures of stimulated emission proceeding 
only from purely thermodynamic con- 
siderations, approximately in the same 
manner as we have done. Note that 
the very notion of light as a stream 
of particles, photons, was first intro- 
duced into science by Einstein in 1905, 
and it was for this discovery that he 
was awarded the Nobel Prize. 


12.9* Lasers 

Stimulated emission opens up a remark- 
able possibility for generating electro- 
magnetic oscillations (light, for one 
thing). 

Suppose that we have a medium in which the concentration $n_1$ of atoms in an excited state is higher than $n_0$, the concentration of atoms in the ground state. Under ordinary conditions of thermal equilibrium, with $n_1/n_0 = e^{-h\nu/kT}$, such a situation is impossible, since here $n_1/n_0 < 1$. But there are several ways in which the reverse can be achieved.

One method of making $n_1$ greater than $n_0$ will be discussed later. Now, without going into details, we will simply assume that $n_1 > n_0$, so that the exact equation (12.8.10) can be written in the following form:

$$\frac{dI}{dx} = bI + \kappa n_1, \qquad (12.9.1)$$

with $b = \sigma (n_1 - n_0)$ positive. The reversal of sign of the coefficient of $I$ in the expression for $dI/dx$ changes drastically the behavior of the solution. The changes in the properties of the solution can be compared to those that occur in the solution for the concentration of neutrons in radioactive decay when we go over from the subcritical mass of the fissionable material (uranium-235 or plutonium) to the supercritical mass (Section 8.9).

Indeed, if $b$ in the equation for $dI/dx$ is positive, the light flux will increase exponentially as it passes through the medium. Neglecting the second term on the right-hand side of (12.9.1), we find that

$$I = I_0 e^{bx} > I_0, \qquad (12.9.2)$$

where $I_0$ is the intensity of the light at the origin of coordinates, that is, at $x = 0$ (at the entrance to the medium). A definite volume of the gas (the substance) with $b > 0$, say, a pipe of length $L$ with the gas in it, operates like an amplifier: with a flux $I_0$ at the entrance, we have a flux $I_L = I_0 e^{bL} > I_0$ at the exit which is more powerful than the incident flux.

Now suppose that at both ends of the pipe we have mirrors with reflection coefficients $\beta$ and $\delta$. If at the left end ($x = 0$) we have an influx of radiation $I_0$, then at the right end of the pipe we have, as we know, $I_L = I_0 e^{bL}$. This flux will be reflected by the right mirror (reflection coefficient $\beta$), and the reflected flux

$$|I'_L| = \beta I_L = I_0 \beta e^{bL} \qquad (12.9.3)$$

enters the pipe from the right (this flux must be considered negative since the energy is transferred by this flux in the direction opposite to the positive direction of the $x$ axis). So as not to have to think about the right sign, we put a prime on the respective flux (we will write $I'$) and consider only its absolute value.

The equation for $I'$ has the form

$$\frac{1}{|I'|}\,\frac{d|I'|}{dx} = -b. \qquad (12.9.4)$$

We integrate this equation “from right to left,” with the “initial condition” being the quantity $|I'_L| = I_0 \beta e^{bL}$ corresponding to the value $x = L$ (see (12.9.3)), and find the flux value $I'$ corresponding to $x = 0$:

$$|I'_0| = |I'_L| \exp\left(-b \int_L^0 dx\right) = |I'_L|\,e^{bL} = I_0 \beta e^{2bL}. \qquad (12.9.5)$$

After reflection from the left mirror (reflection coefficient $\delta$) we have a flux (see Figure 12.9.1) equal to

$$I''_0 = \beta\delta e^{2bL} I_0. \qquad (12.9.6)$$

Now we must allow for the fact that the total flux from left to right is

$$I_{tot} = I_0 + I''_0,$$

where $I_0$ is the intensity of the incident light, and $I''_0$ is the intensity of the twice reflected light. On the right-hand side of the expression (12.9.6) for $I''_0$ we must put $I_{tot}$ instead of $I_0$ (above we did not distinguish between $I_0$ (the incident intensity) and $I_{tot}$ because double reflection was not considered a possibility). We then have

$$I_{tot} = I_0 + I''_0 = I_0 + \beta\delta e^{2bL} I_{tot},$$

whence

$$I_{tot} = \frac{I_0}{1 - \beta\delta e^{2bL}}. \qquad (12.9.7)$$
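The closed form $I_{tot} = I_0/(1 - \beta\delta e^{2bL})$ can also be obtained by adding up the incident flux and the contributions of one, two, three, ... double reflections, each of which multiplies the flux by the round-trip factor $\beta\delta e^{2bL}$. The sketch below compares the two and finds the value of $b$ at which the round-trip factor reaches unity. All numerical values and function names are assumptions chosen for illustration.

```python
import math

def round_trip_gain(beta, delta, b, length):
    """beta*delta*exp(2*b*L): factor acquired by the flux in one round trip through the pipe."""
    return beta * delta * math.exp(2.0 * b * length)

def total_flux_closed_form(i0, beta, delta, b, length):
    """I_tot = I_0 / (1 - beta*delta*exp(2*b*L)); meaningful only while the round-trip gain is below 1."""
    g = round_trip_gain(beta, delta, b, length)
    if g >= 1.0:
        raise ValueError("round-trip gain >= 1: no stationary state with a finite flux")
    return i0 / (1.0 - g)

def total_flux_by_round_trips(i0, beta, delta, b, length, trips=2000):
    """Sum of the incident flux and the fluxes returned after successive double reflections."""
    g = round_trip_gain(beta, delta, b, length)
    total, term = 0.0, i0
    for _ in range(trips):
        total += term
        term *= g
    return total

def threshold_gain_coefficient(beta, delta, length):
    """Value of b for which beta*delta*exp(2*b*L) equals 1."""
    return -math.log(beta * delta) / (2.0 * length)

# Assumed illustrative parameters: mirror reflectivities, gain per metre, pipe length in metres
beta, delta, b, length = 0.9, 0.95, 0.05, 1.0
print(total_flux_closed_form(1.0, beta, delta, b, length))
print(total_flux_by_round_trips(1.0, beta, delta, b, length))
print(threshold_gain_coefficient(beta, delta, length))
```

Below the threshold the geometric series converges to the closed form; as $b$ approaches the threshold value the denominator tends to zero and the stationary flux grows without bound.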
From the standpoint of physics we can say that at $\beta\delta e^{2bL} = 1$ the system consisting of a pipe with the active gas and two mirrors (with reflection coefficients $\beta$ and $\delta$) is in a “critical” state: a stationary state is possible for any value of $I_0$. If $\beta\delta e^{2bL}$ is less than unity, the system is “subcritical” (cf. Section 8.9). If we ignore spontaneous emission, there is only one solution, $I_0 = 0$, while if we allow for spontaneous emission (that is, the term $\kappa n_1$ in Eq. (12.9.1)), radiation with an intensity proportional to $\kappa n_1 L$ sets in in the system. Finally, if we allow for stimulated emission, the amplification (or gain, as it is called) is even higher: it is proportional to the ratio $1/(1 - \beta\delta e^{2bL})$.

From the standpoint of technology such a situation is of little interest. A remarkable situation emerges when $\beta\delta e^{2bL}$ is greater than unity. By analogy with a chain nuclear reaction, here we can speak of the system being in the “supercritical” state. In this case, even for a very small initial intensity and low spontaneous emission there will be an exponential growth in the radiation intensity with time, approximately proportional to $e^{\gamma t}$, where $\gamma \approx (c/2L)(\beta\delta e^{2bL} - 1)$. The radiation intensity will grow, but so will the number of excited atoms used up; $n_1$ and $b$ will decrease as long as the system remains noncritical; the system will become critical only at a high value of $I$.

Thus, in a system with b positive, 
radiation of high intensity can be 
achieved. The radiation is strictly mono- 
chromatic (it is represented by a single 
oscillation wave), with the result that 
it can be focused into a point. A device 
that converts input power into a very 
narrow, intense beam of coherent visible 
or infrared light is called a laser (for light amplification by stimulated emission of radiation) or maser (for microwave amplification by stimulated emission of radiation).

The applications of a laser beam have 
been described in numerous articles, 


booklets, and books. Here, in a textbook 
on mathematics, we will only briefly 
point out two methods for obtaining 
inverse population, or an active medium; this is the name that physicists use to describe the situation with $n_1 > n_0$, that is, $b > 0$, which is required for operation of a laser.

One method (historically the first), 
which was employed by the Nobel Prize 
winners N. G. Basov, A. M. Prokhorov 
(both USSR), and Ch. Townes (USA), 
consists in sending molecules that are 
in states 1 and 0 into a capacitor 
placed in a vacuum, in which they 
move without colliding with each other 
under an electric field. The electrical 
properties, namely, the polarizabilities, 
of molecules in states 0 and 1 are differ- 
ent. Using this fact, the inventors of 
the laser were able to set up conditions 
in which the molecules in the ground 
state were deflected away from a vessel, 
while the molecules in state 1 gathered 
in the vessel. Thus, in the vessel, $n_1$ was greater than $n_0$, which is just the situation required for amplification of light.

The other method, which is more con- 
venient and does not require a high 
vacuum but which historically was de- 
veloped later than the first, consists 
in employing a system with several 
levels: 0, 1, 2, 3. The system “works,” that is, emits radiation, using the transition onto one of the intermediate levels, say, the transition $3 \to 2$ with the frequency $\nu_{3,2} = (E_3 - E_2)/h$. For this to happen, we must excite state 3, that is, create a high concentration $n_3$. Here, however, $n_3$ remains lower than $n_0$, since it is impossible to create an inverse population, that is, to make $n_3$ greater than $n_0$, by irradiating the gas with light of frequency $\nu_{3,0}$ because of stimulated emission of light. (Why?)

But if molecules go over very rapidly from level 2 to low-lying levels, $2 \to 1$ or $2 \to 0$, there is a high probability that $n_2 < n_0$ and $n_2 < n_3$, that is, the system becomes supercritical with respect to the emission of light of frequency $\nu_{3,2}$.

There are also other methods for 
creating an active medium. A com-
mon feature of all lasers is the trans- 
formation of input power into a 
highly energetic, plane, monochromat- 
ic electromagnetic wave. If such energy 
transfer is achieved, the energy can 
easily be focused into an extremely 
small volume. With high energy con- 
centration it becomes possible to achieve 


ultrahigh temperatures and pressures, 
which could be used in controllable 
thermonuclear fusion. Today lasers are 
being widely used in surgery, in pro- 
cessing of materials, and in chemical 
reactions, but all this, as we have al- 
ready said, belongs to books on other 
topics. 
Chapter 13 Electric Circuits and Oscillatory Phenomena in Them


13.1 Basic Concepts and Units 
of Measurement 

In this chapter we consider phenomena that occur in electric circuits. The principal elements of an electric circuit are resistance, capacitance, inductance, and sources of current (voltage). Our ex-
position is not designed to take the place 
of a standard physics textbook but 
rather to supplement, develop, and 
refine some of the knowledge contained 
in a school textbook. We will therefore 
confine ourselves to a brief review of the 
definitions of resistance, capacitance, 
and so forth and to their units, on the 
assumption that the reader is sufficient- 
ly acquainted with the basic notions 
from the school physics course. 

The quantity of electricity is defined as the net charge, or the difference between the positive charge and the negative charge. We denote it by $q$. The unit for the quantity of electricity is the coulomb (abbreviated: C). The elementary charge, the charge of the proton, is $e_p = 1.6 \times 10^{-19}$ C, and the electron charge is $e_e = -1.6 \times 10^{-19}$ C.

Electric current is defined as the quantity of electricity flowing in unit time through a cross section of a conductor. We will denote current by $j$. The unit of current is the ampere (abbreviated: A); this is a current in which 1 coulomb of electricity passes through a cross section of a conductor in one second. Thus, 1 A = 1 C/s, but it is more convenient to say that 1 C = 1 A$\cdot$s, since in the SI system of units the ampere is a base unit and the coulomb is a derived unit (see Appendix 5).

For the (positive) direction of current we take the direction in which positive charges would have to move in order to produce a given current. Actually,
in metallic conductors positive charges 
are stationary, and the current flows 
due to the motion of electrons. As a 
rule, a positively charged body is one 


which lost part of its electrons (only 
in rare cases is a positive charge the 
result of a body acquiring positive 
charges). A negatively charged body is 
one which has acquired a surplus of 
electrons. The direction of current is 
opposite to that in which the electrons 
move in a conductor. 

The electric potential of a given point is the potential energy that a positive charge of 1 C possesses when placed at the given point. The electric potential of the ground (earth) is taken to be zero. Hence, the point of a circuit connected to the ground by a metal conductor (we say it is grounded or earthed) has potential zero. The unit of potential is the volt (abbreviated: V). The potential of a point is equal to 1 volt (1 V) if a charge of 1 coulomb placed at this point has a potential energy of 1 joule. (The reader will recall that the joule, the unit of energy and work, is defined as the work performed by a force of 1 N over a distance of 1 m, or 1 J = 1 N$\cdot$m = 1 kg$\cdot$m$^2$/s$^2$.) The potential energy $u$ of a charge $q$ placed at a point where the potential is equal to $\varphi$ is

$$u\,(\text{J}) = q\,(\text{C}) \cdot \varphi\,(\text{V}). \qquad (13.1.1)$$

We have to imagine here that $q$ is small, because if a large charge (say 1 C) is placed at the given point, the potential $\varphi$ will change. For this reason, it is better to say that the potential is the coefficient of $q$ in (13.1.1), or the factor that connects the potential energy of a charge and the quantity of electricity.

The work $A$ performed by a field in transferring a charge from a point where the potential is equal to $\varphi_1$ to a point where the potential is $\varphi_2$ is

$$A = u_1 - u_2 = q(\varphi_1 - \varphi_2).$$

Just as in mechanics only the difference of potential energies enters into all physical results, so in electricity the formulas always involve a difference of potentials but never the potential energies themselves. There will be no change in the potential difference if to all potentials at all points we add an identical summand. It is therefore possible to choose the potential of any point in a circuit or a piece of equipment in arbitrary fashion, say, set it equal to zero. However, after this has been done, the potentials of all other points become quite definite quantities. It is precisely for this reason that we can take the ground potential as zero.

Figure 13.1.1

Figure 13.1.2
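The statement that only potential differences matter can be checked with a few lines of Python. This is a minimal sketch, not part of the original text: the function name `work_by_field` and all numerical values are hypothetical, chosen only for illustration.

```python
def work_by_field(q, phi_from, phi_to):
    """Work (J) done by the field on a charge q (C) moved between two potentials (V)."""
    return q * (phi_from - phi_to)

q = 2e-6                      # hypothetical charge, coulombs
phi_1, phi_2 = 12.0, 3.0      # hypothetical potentials, volts
shift = 5.0                   # the same constant added to every potential

w_original = work_by_field(q, phi_1, phi_2)
w_shifted = work_by_field(q, phi_1 + shift, phi_2 + shift)
print(w_original, w_shifted)  # the two values agree
```

Adding the same constant to both potentials leaves the computed work unchanged, which is why any one point (for instance, the ground) may be assigned zero potential.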

Let us consider a capacitor (Figure 13.1.1) consisting of two parallel plates, A and B. One of the plates (say A) can be connected to some source of voltage. The quantity of electricity on this plate is directly proportional to the potential difference across the plates of the capacitor,

$q_A = C\varphi_C$,

where $\varphi_C$ is the difference of potentials defined as the potential of plate A minus the potential of plate B. Since in Figure 13.1.1 plate B is grounded, it follows that $\varphi_C$ in this case is equal to the potential of plate A.

The coefficient of proportionality $C$ is called the capacitance of the capacitor. The unit of capacitance is the farad (abbreviated: F). It is the capacitance of a capacitor in which the potential difference across the plates is 1 volt for a charge of 1 coulomb ($10^{-6}$ F is a microfarad (μF), $10^{-9}$ F a nanofarad (nF), and $10^{-12}$ F a picofarad (pF)).

An equal quantity of electricity (but opposite in sign) accumulates (builds up) on plate B:

$q_B = -q_A = -C\varphi_C$.

The electric charge of a closed system is a conserved quantity, that is, electric charges of the same sign never appear or disappear in any process whatsoever.$^{13.1}$ A change in the charge on plate A of the capacitor is due to the fact that a portion of the charge left the plate and moved to some other site, say, point D in Figure 13.1.2.

If a current $j$ is flowing from D to A (from left to right), then in time $dt$ a quantity of electricity $j\,dt$ will flow through the cross section of the conductor, whence

$dq_A = j\,dt$, or $\frac{dq_A}{dt} = j$.

Let us now find out what a current flowing in a conductor depends on. By Ohm’s law the current is proportional to the potential difference across the terminals of the conductor, the current flowing from higher to lower potential. Thus,

$j = k(\varphi_D - \varphi_A) = \frac{1}{R}(\varphi_D - \varphi_A)$. (13.1.2)

The positive quantity $k$ is called the conductance. The reciprocal, $1/k$, is called the resistance of a conductor and is denoted by $R$. The unit of resistance is the ohm (abbreviated: Ω), which is the resistance of a conductor through which a current of 1 ampere is flowing when a potential difference of 1 volt is impressed across the terminals, so that 1 Ω = 1 V/A.

We denote the quantity $\varphi_D - \varphi_A$ by $\varphi_R$. This is the potential across the resistance $R$. The value of $\varphi_R$ is defined (just like that of $\varphi_C$) as the left-hand potential minus the right-hand potential (the current flows from left to right). Ohm’s law (13.1.2) can then be written as

$j = \frac{\varphi_R}{R}$, or $\varphi_R = Rj$. (13.1.3)


13.1 The net electric charge of a system remains unchanged when two particles of equal and opposite charge appear or disappear.

Figure 13.1.3

As a source of voltage in a circuit we 
can take a voltaic cell. There is a de- 
finite potential difference across the 
terminals of the cell. We can assume, 
roughly, that the potential difference 
is independent of the current flowing 
through the cell. In particular, in a cell the current can flow from a low potential to a high potential. Through a resistance, the current always flows from a high potential to a low potential, just as water in a tube connecting two vessels flows from high level to low level.

A voltaic cell is like a pump that 
can take in water in a low-level vessel 
and pump it up to a high-level vessel, 
that is to say, make the water move 
uphill. To operate the pump we need 
some kind of external source of energy. 
The same applies to the cell. When 
the current flows from low to high 
potential, chemical reactions take place 
in the cell. The energy of these chemical 
reactions in the cell is transformed into 
electric energy. 

The potential difference which the cell yields is termed the electromotive force, which we abbreviate to emf.

The potential difference across the cell taken as the left-hand potential minus the right-hand potential (Figure 13.1.3) is equal to minus the electromotive force of the cell:

$\varphi_1 - \varphi_2 = -E$.

In reality, the emf is slightly dependent on the current flowing through the cell. When the current is flowing (from left to right in Figure 13.1.3) in the direction from low potential to high potential (which is the normal operating condition of the cell when it is generating electric energy), the emf $E$ diminishes with increasing current flow. Approximately, we can take it that the emf is constant, but more exactly we must allow for a linear dependence of $E$ on $j$:

$E = a - bj$, (13.1.4)

where the (positive) coefficients $a$ and $b$ are characteristics of the cell. We call a cell whose emf is independent of the current $j$, that is, $b = 0$ in (13.1.4), an ideal cell.

Figure 13.1.4

Let us consider a series connection of an ideal cell with an emf equal to $a$ and a resistance $b$ (see Figure 13.1.4). Then

$\varphi_E = \varphi_1 - \varphi_2 = -a$,
$\varphi_b = \varphi_2 - \varphi_3 = bj$,

whence

$\varphi_1 - \varphi_3 = (\varphi_1 - \varphi_2) + (\varphi_2 - \varphi_3) = -a + bj = -(a - bj) = -E$.

We have again arrived at formula (13.1.4) for the emf $E$, whereby $b$ is often called the internal resistance of the cell. The real cell, with which we usually deal, which has an emf that satisfies formula (13.1.4), yields the same dependence of $E$ on $j$ as the series connection of an ideal cell and a resistance $b$. The name for $a$ remains the same, the emf of a real cell, bearing in mind that $E = a$ when $j = 0$, and the emf drop for $j \neq 0$ is characterized by $b$.$^{13.2}$

13.2 Since the current is zero for an open circuit, the emf may be defined as the potential difference across a disconnected cell.

In the sequel, when considering electric circuits involving current sources, for example a cell, and various resistances, we can imagine that we are dealing with an ideal cell with constant emf independent of the current, while the internal resistance $b$ may be combined with the external resistance $R$. Thus, a real cell with internal resistance $b$ connected in series with a resistance $R$ is equivalent to an ideal cell connected in series with a resistance $R_1 = R + b$. The current flowing in the circuit is then given by the formula

$j = \frac{E}{R + b}$,

where $E$ is the constant emf of the ideal cell.
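The equivalence between a real cell and an ideal cell in series with its internal resistance can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names and the values of `a`, `b`, and `R` are hypothetical, and the constant emf of the ideal cell is taken to be the open-circuit value `a` from (13.1.4).

```python
def real_cell_emf(a, b, j):
    """Emf E = a - b*j of a real cell carrying current j, as in formula (13.1.4)."""
    return a - b * j

def circuit_current(a, b, R):
    """Current when an ideal cell of emf a drives the total resistance R + b."""
    return a / (R + b)

a, b, R = 1.5, 0.5, 2.5        # hypothetical values: volts, ohms, ohms
j = circuit_current(a, b, R)
emf_under_load = real_cell_emf(a, b, j)
drop_on_R = R * j              # Ohm's law (13.1.3) for the external resistance

print(j, emf_under_load, drop_on_R)
```

The emf of the loaded real cell, `a - b*j`, coincides with the voltage drop `R*j` on the external resistance, so both descriptions give the same current.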

It is worth once again paying special attention to the difference between resistance and source of voltage. If in a circuit there is a potential difference across a resistance, such that $\varphi_2 > \varphi_1$ (Figure 13.1.4), then, by our definition, $\varphi_R = \varphi_1 - \varphi_2 < 0$, that is, $\varphi_R$ is negative. Hence, by formula (13.1.3), the current is negative, too, which means that the current flows from right to left, from point 2 to point 1. Now suppose that there is a potential difference of the same sign across the terminals of the voltage source, and the dependence of $E$ on $j$ is given by formula (13.1.4) (see Figure 13.1.3). Here, let $\varphi_3 > \varphi_1$ but $\varphi_3 - \varphi_1 < a$. Then $bj = \varphi_1 - \varphi_3 + a = a - (\varphi_3 - \varphi_1) > 0$, that is, the current is positive and flows from left to right despite the fact that the potential $\varphi_1$ on the left is less than the potential $\varphi_3$ on the right. Thus, the voltage source is capable of overcoming the potential difference and yielding a positive current (from left to right) for a negative potential difference ($\varphi_1 - \varphi_3 < 0$), provided that this negative potential difference does not exceed in absolute value the emf of the source. Yet for a negative potential difference, the resistance always yields a negative current. In any circuit containing only resistances, that is, circuits without emf, the current can only be identically zero, while if there are cells, that is, emf’s (Figure 13.1.5), a nonzero solution (current) is possible.

Now let us consider inductance. The phenomenon of inductance is connected with the magnetic field that is produced in the space surrounding a conductor that carries a current. The magnetic field is particularly strong if the conductor has the form of a coil with a large number of turns. The field is further increased if the coil is wound on an iron core.

Figure 13.1.6

The magnetic field, in turn, gives rise to electric phenomena (if the magnetic field varies). As we know, given a varying magnetic field, each turn (even every portion of a turn) of the coil becomes a source of voltage, something like a voltaic cell. In a coil in which the turns are wound so that the current traverses the core of the coil in the same direction throughout the length of the coil, all these voltage sources are connected in series, so the overall voltage builds up (the voltages are additive in a series connection).

On the whole, a coil is equivalent to a voltage source with a potential difference proportional to the rate of change of the magnetic field. But the magnetic field in a coil is proportional to the current flowing in the coil.$^{13.3}$ For this reason, the rate of change of the magnetic field is proportional to the rate of change of current flow, that is, to the derivative $dj/dt$. Referring to Figure 13.1.6, we find that in the coil

$\varphi_L = L\frac{dj}{dt}$, (13.1.5)

and the positive direction of the current is taken to be from point 1 to point 2 inside the coil, while the quantity $\varphi_L$ is the potential difference across the coil (it is defined as the potential $\varphi_1$ on the left minus the potential $\varphi_2$ on the right). When considering in detail the direction of the magnetic field and the emf induced by its variation, it is found that the coefficient $L$ (the inductance) is always positive.

13.3 We will not discuss the case of two coils wound on one core, which is a transformer connecting two electric circuits that carry different currents. Neither do we consider cases of a more complicated dependence of the magnetic field on the current when an iron core is inserted in the coil and the current is so great that the iron is saturated.

From formula (13.1.5) it follows that if $dj/dt$ is negative, then $\varphi_1 - \varphi_2$ is negative, too, that is, $\varphi_2 > \varphi_1$. Thus, if the current is positive (flows from 1 to 2) and decreases in magnitude, the coil plays the role of a cell sustaining a positive current in the circuit, despite the fact that $\varphi_L$ is negative. But if the current is positive and increasing, $dj/dt$ is positive, and so $\varphi_L$ is positive, too. In this case, the coil plays the part of an additional resistance, since the potential difference across the coil is positive for a positive current (cf. (13.1.3)).

A coil differs essentially from a voltage source and from a resistance in that the quantity $\varphi_L$ depends not on the current intensity $j$ but on the rate of change of the current, $dj/dt$.
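The sign rule for the coil can be made concrete with a small numerical sketch. The code below is an illustration only: the inductance value, the two current laws, and the helper name `coil_voltage` are hypothetical, and the derivative in (13.1.5) is approximated by a central difference.

```python
def coil_voltage(L, current, t, h=1e-6):
    """Approximate phi_L = L * dj/dt at time t using a central difference."""
    return L * (current(t + h) - current(t - h)) / (2 * h)

L = 0.2  # hypothetical inductance, henries

def falling(t):
    return 2.0 - 0.5 * t   # positive current decreasing at 0.5 A/s

def rising(t):
    return 2.0 + 0.5 * t   # positive current increasing at 0.5 A/s

phi_falling = coil_voltage(L, falling, 1.0)   # negative: coil acts like a cell
phi_rising = coil_voltage(L, rising, 1.0)     # positive: coil acts like a resistance
print(phi_falling, phi_rising)
```

In both cases the current itself is positive and equal in magnitude at the chosen instant; only the sign of its rate of change decides the sign of the potential difference across the coil.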

The coefficient $L$ in Eq. (13.1.5) bears the name “coil inductance” (also self-inductance).$^{13.4}$ The unit of inductance is the henry (abbreviated: H). If the inductance of a coil is equal to 1 henry, this means that when the current is changing at a rate of 1 ampere per second, a potential difference of 1 volt is induced in the coil. We obtain the dimensions of inductance from formula (13.1.5):

$1\ \mathrm{H} = 1\ \mathrm{V\cdot s/A}$.

From the foregoing it is clear that inductance affects the current in a circuit just like an inert mass (flywheel) affects velocity: inductance impedes any change in the current, and a mass (by Newton’s second law) tends to impede any change in velocity. This similarity will be discussed in more detail in Section 13.8.

13.4 Often instead of saying “a coil with inductance $L$” we simply say “inductance $L$”, just as we say “capacitance $C$” instead of “a capacitor with capacitance $C$”, or emf $E$ instead of mentioning a voltaic cell or a voltage source.

Figure 13.1.7

From the standpoint of subsequent 
computations, capacitance, resistance, 
emf, and inductance have one thing 
in common, they all require two ter- 
minals for connection in a circuit (un- 
like, say, a transformer, which requires 
four leads, or a transistor, which has 
three leads: collector, base, emitter). 
Devices with circuit connections involv- 
ing two leads are called two-terminal 
networks, while if there are four leads, 
they are two-terminal pair networks 
(or four-pole networks). Each circuit 
element— capacitance, resistance, emf, 
and inductance— is characterized at any 
given time by a specific current passing 
through it and a definite potential 
difference at input and output. 

We can imagine a closed box (labelled “Box” in Figure 13.1.7) with two leads, A and B, sticking out of it. The interior of the box may contain anything: R, E, L, and C. Connect an ammeter A and a voltmeter V. With the circuit connections as shown in Figure 13.1.7 (the “+” and “−” signs correspond to the labels at the terminals of the ammeter and voltmeter), the ammeter records the current $j$ passing in the direction from point A to point B, and the voltmeter indicates the potential difference $\varphi_{\mathrm{Box}} = \varphi_A - \varphi_B$. The relationship between $\varphi_{\mathrm{Box}}$ and $j$ depends on what is inside the box:

in the case of a resistance $R$,

$\varphi_{\mathrm{Box}} = Rj$, (13.1.6)

in the case of an emf$^{13.5}$ $E_0$,

$\varphi_{\mathrm{Box}} = -E_0$, (13.1.7)

in the case of an inductance $L$,

$\varphi_{\mathrm{Box}} = L\frac{dj}{dt}$, (13.1.8)

in the case of a capacitance$^{13.6}$ $C$,

$\varphi_{\mathrm{Box}} = (\varphi_{\mathrm{Box}})_0 + \frac{1}{C}\int_{t_0}^{t} j\,dt$ (13.1.9)

or

$\frac{d\varphi_{\mathrm{Box}}}{dt} = \frac{1}{C}\,j$. (13.1.9a)

13.5 The internal resistance of the emf source is disregarded.


There are of course cases in which 
more complicated relationships are in- 
volved. For example, a rectifier (a va- 
cuum-tube diode or a semiconductor 
diode) does not fit any of the formulas 
(13.1.6)-(13.1.9). However, in a large 
number of important problems we can 
confine ourselves to considering the 
circuit elements for which these for- 
mulas are valid to a high degree of 
accuracy. These are the circuits that 
we will investigate (with the exception 
of Section 17, where we give special 
consideration to the properties of a cir- 
cuit with a device that exhibits a com- 
plicated relationship between current 
and potential difference). 

Let us study the circuit shown in Figure 13.1.8. We will first write down the voltage drops on the separate elements of the circuit:

$\varphi_C = \varphi_{A_C} - \varphi_{B_C}$, $\varphi_R = \varphi_{A_R} - \varphi_{B_R}$,
$\varphi_L = \varphi_{A_L} - \varphi_{B_L}$, $\varphi_E = \varphi_{A_E} - \varphi_{B_E}$. (13.1.10)

And observe that $\varphi_{B_C} = \varphi_{A_R}$, $\varphi_{B_R} = \varphi_{A_L}$, and $\varphi_{B_L} = \varphi_{A_E}$. Whence, adding all the equations in (13.1.10) termwise, we obtain

$\varphi_C + \varphi_R + \varphi_L + \varphi_E = \varphi_{A_C} - \varphi_{B_E}$.

If the circuit in Figure 13.1.8 is closed, then $\varphi_{A_C} = \varphi_{B_E}$. In this case, consequently,

$\varphi_C + \varphi_R + \varphi_L + \varphi_E = 0$. (13.1.11)

This general equation, together with the expressions (13.1.6) to (13.1.9), fully describes all the processes that occur in the circuit. Below we will use this equation to examine a variety of circuits, beginning with the very simplest, which consists of only two elements.

13.6 Here $q_A = C\varphi_{\mathrm{Box}}$ and $dq_A/dt = j$, whence $d\varphi_{\mathrm{Box}}/dt = j/C$. If at the initial time $t = t_0$ we have $\varphi_{\mathrm{Box}} = (\varphi_{\mathrm{Box}})_0$, then

$\varphi_{\mathrm{Box}} = (\varphi_{\mathrm{Box}})_0 + C^{-1}\int_{t_0}^{t} j\,dt$.

13.2 Discharge of a Capacitor Through 
a Resistor 

Let us examine the process in a circuit with capacitance $C$ and resistance $R$ (Figure 13.2.1). We denote by $\varphi$ the potential of point A (plate B of the capacitor will be grounded). To begin with, let $\varphi = \varphi_0$. The corresponding quantity of electricity on plate A is $q_A = C\varphi_0$.

Can we speak of a current flowing 
through a capacitor? A capacitor con- 
sists of two plates separated by an 
insulator (say, air) so that in reality 
an electron cannot go through the 
capacitor, say, from A to B . However, 
if a positive charge is impressed on 
plate A , then plate B will have a nega- 
tive charge, and a positive charge will 
flow out of plate B along the wire (the 
current also goes from left to right). 
Two ammeters, $A_1$ and $A_2$, one of which measures current in the wire connected to plate A, the other in the wire connected to plate B, will have identical readings. Precisely what flows (positive charges or electrons) through different portions of the electric circuit does not interest us, just as we are not interested in whether the same electrons pass through $A_2$ that have passed through $A_1$ or not. For this
reason we will henceforth only speak of 
the current passing through the capacitor 
and will have in mind the current 
flowing in the conductors connected to 
the plates of the capacitor. We can 
speak of current flowing in an electric 
circuit through a capacitance in the 
same way that we speak of current 
flowing through a resistance or an in- 
ductance. The difference lies in the 
different type of relationship between 
current and potential difference as ex- 
pressed by the formulas (13.1.9) and 
(13.1.9a). 

When we close the switch P (see Figure 13.2.1), a current

$j = \frac{1}{R}\varphi_R$

will flow through resistance $R$. By Eq. (13.1.11), $\varphi + \varphi_R = 0$, whence $\varphi_R = -\varphi$ and so

$j = -\frac{1}{R}\varphi$. (13.2.1)

Since current flowing from left to right is taken to be positive, it follows from (13.2.1) that for $\varphi > 0$ the current is negative, it flows from right to left, and the capacitor becomes discharged.$^{13.7}$ Recalling that $j = dq/dt$ (current flowing through a capacitance) and $q = C\varphi$, we find that

$j = C\frac{d\varphi}{dt}$. (13.2.2)

13.7 Observe that in all circuits having the form of a rectangle (see Figure 13.1.8 and subsequent figures), we speak of the direction of current in the upper side of the rectangle; the current flow in the bottom side that closes the circuit will clearly be in the opposite direction.

Comparing (13.2.1) and (13.2.2), we obtain

$\frac{d\varphi}{dt} = -\frac{\varphi}{RC}$. (13.2.3)

We solved a differential equation like this in connection with the problem of radioactive decay (see Chapter 8) and in many other sections of this book. If $\varphi = \varphi_0$ when $t = 0$, then

$\varphi(t) = \varphi_0 e^{-t/RC}$, (13.2.4)

whence, by (13.2.1),

$j(t) = -\frac{\varphi_0}{R}\,e^{-t/RC}$.

It can be seen from formula (13.2.3) that the quantity $RC$ has the dimensions of time. Let us verify this:

$[R] = \Omega = \mathrm{V/A} = \mathrm{V\cdot s/C}$, $[C] = \mathrm{C/V}$,

whence

$[RC] = \frac{\mathrm{V\cdot s}}{\mathrm{C}} \cdot \frac{\mathrm{C}}{\mathrm{V}} = \mathrm{s}$.

During time $t = RC$ the charge $q$ on the capacitor and the current $j$ diminish by a factor of $e$.

The discharging process in a capacitor can easily be observed experimentally. Buy a capacitor with capacitance $C = 20\ \mu\mathrm{F} = 20 \times 10^{-6}$ F and a resistor with resistance $R = 20\ \mathrm{M\Omega} = 20 \times 10^{6}\ \Omega$. For an RC circuit of this type we get $RC = 400$ s, which is a very convenient time for observational purposes.

The quantity RC is known as the 
time constant of a circuit consisting 
of a capacitance and a resistance (recall 
that in the case of radioactive decay 
the mean lifetime was an analogous 
quantity). 
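The discharge law (13.2.4) can be compared with a direct step-by-step integration of equation (13.2.3). The following is a minimal sketch using the values $R = 20\ \mathrm{M\Omega}$ and $C = 20\ \mu\mathrm{F}$ quoted above; the function names, the initial potential of 1 V, and the choice of the simple Euler method are assumptions made here for illustration.

```python
import math

R = 20e6       # ohms (20 megohms, as in the text)
C = 20e-6      # farads (20 microfarads, as in the text)
tau = R * C    # time constant RC, about 400 s

def phi_exact(phi0, t):
    """Analytic solution (13.2.4): phi(t) = phi0 * exp(-t / RC)."""
    return phi0 * math.exp(-t / tau)

def phi_euler(phi0, t_end, steps=100_000):
    """Integrate d(phi)/dt = -phi / RC, equation (13.2.3), with Euler steps."""
    dt = t_end / steps
    phi = phi0
    for _ in range(steps):
        phi += dt * (-phi / tau)
    return phi

phi0 = 1.0  # hypothetical initial potential, volts
print(tau, phi_exact(phi0, tau), phi_euler(phi0, tau))
```

After one time constant both results are close to $\varphi_0/e$, in agreement with the statement that the charge and the current fall by a factor of $e$ during the time $RC$.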

We will now consider the problem of charging a capacitor through a resistor. The circuit diagram is shown in Figure 13.2.2. If switch P is closed, then, by (13.1.11), $\varphi_E + \varphi_R + \varphi = 0$, where $\varphi$ is the potential of the nongrounded plate of the capacitor. Since $\varphi_E = -E_0$ and $\varphi_R = Rj$, it follows that $-E_0 + Rj + \varphi = 0$. The current flowing through the capacitance is $j = dq/dt = C\,(d\varphi/dt)$, whereby

$-E_0 + RC\frac{d\varphi}{dt} + \varphi = 0$, or $\frac{d\varphi}{dt} = -\frac{1}{RC}(\varphi - E_0)$. (13.2.5)

Figure 13.2.2

To find out how $\varphi$ varies with time, it will be convenient to make the change of variable $z = \varphi - E_0$, with $dz = d\varphi$. Equation (13.2.5) can then be rewritten thus:

$\frac{dz}{dt} = -\frac{z}{RC}$.

Its solution is

$z = z_0 e^{-t/RC}$, (13.2.6)

where $z_0$ is the value of $z$ at the initial time.

Let us find the solution for the case where at the initial time the capacitor is not charged: $\varphi = 0$ at $t = 0$. Then $z_0 = -E_0$. From (13.2.6) we get $z = -E_0 e^{-t/RC}$, or

$\varphi = z + E_0 = -E_0 e^{-t/RC} + E_0 = E_0(1 - e^{-t/RC})$. (13.2.7)

The graph of $\varphi$ as a function of $t$ is given in Figure 13.2.3. The curve corresponds to the formula (13.2.7), while the dashed horizontal line represents the value $\varphi = E_0$ which the solution approaches with the passage of time. The quantity $z$ has the geometric meaning of vertical distance from the curve to the dashed line. This distance diminishes exponentially with the passage of time.

During a time equal to $RC$, the charge of the capacitor reaches 63% of its final value, during time $2RC$ it reaches 86%, and during time $3RC$ it reaches 95% of its final value.
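The percentages quoted above follow directly from formula (13.2.7), since the charged fraction after a time $nRC$ is $1 - e^{-n}$. A short sketch (the function name `charged_fraction` is hypothetical) reproduces them:

```python
import math

def charged_fraction(n):
    """Fraction of the final charge reached after n time constants, from (13.2.7)."""
    return 1.0 - math.exp(-n)

# Rounded percentages after 1, 2, and 3 time constants.
percentages = [round(100 * charged_fraction(n)) for n in (1, 2, 3)]
print(percentages)  # [63, 86, 95]
```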



Figure 13.2.3

From formulas (13.2.4) and (13.2.7) it is evident that charging and discharging the capacitor proceed the faster the smaller the resistance $R$ and the smaller the capacitance $C$.

Figure 13.2.4

Exercises 

13.2.1. Referring to Figure 13.2.1, $C = 10^{-6}$ F and $R = 10^{7}\ \Omega$, $R = 10^{8}\ \Omega$, $R = 10^{9}\ \Omega$. For each of these cases, determine the time lapse during which the current flowing through the capacitor at the initial time falls off by 10%; decreases by a factor of 2.

13.2.2. Consider the process of equalizing the potential across a resistance $R$ in series with two capacitors, $C_1$ and $C_2$, one of which at time $t = 0$ is charged to a potential difference $\varphi_{C_1}(0) = a$, while the other is not charged at all, that is, $\varphi_{C_2}(0) = 0$ (Figure 13.2.4).

13.2.3. Determine the variation of the time constant of the circuit depicted in Figure 13.2.1 if all linear dimensions of the circuit are increased n-fold (for the case of a plane-parallel capacitor). (The condition of the problem is to be understood in this way: the dimensions of the capacitor and the resistance are increased but the materials of which they are made are not changed.) [Hint. The formula for the capacitance of a plane-parallel capacitor is known from physics: $C = \varepsilon S/4\pi d$, where $S$ is the area of a plate of the capacitor, $d$ the distance between the plates, and $\varepsilon$ a constant characterizing the material between the plates (the dielectric constant). The resistance of a wire resistor is found from the formula $R = \rho l/\sigma$, where $l$ is the length of the wire, $\sigma$ the cross-sectional area of the wire, and $\rho$ a constant characterizing the material of the wire.]

13.3 Oscillations in a Capacitance 
Circuit with Spark Gap 

A typical circuit diagram involving a capacitance is shown in Figure 13.3.1. The circuit includes a voltage source with emf $E$ and resistance $R$ (the role of $R$ may be played by the internal resistance of the voltage source). Underneath is a spark gap. For a potential difference less than a certain value $\varphi_1$, the spark gap is an insulator. At $\varphi = \varphi_1$ a spark jumps the gap and the air between the wires heats up and becomes a good conductor. We denote the total resistance of the leads and the incandescent air by $r$. The quantity $r$ is small and remains small as long as a current flows maintaining a high temperature of the air. For a definite small value of current $j_2$ the air cools and the spark gap again becomes an insulator. This current value is associated with the potential difference $\varphi_2 = j_2 r$. Here $\varphi_1 > \varphi_2$: a higher voltage is needed to initiate a spark than to keep it burning.

Figure 13.3.1
Figure 13.3.2 shows the dependence 
of cp on t for such a circuit. The capacitor 
is charged over the section OA, with 
no current flowing through the spark 
gap. In this case the solution (13.2.6) 
is valid: 

<p = £(l -e-V™). (13.3.1) 

The potential difference at point A 
at time t = t A reaches the value cp l7 
the spark gap begins to conduct cur- 
rent, and the capacitor discharges. 
Since in this case R > r, the current 
from the voltage source can be ignored 
as compared to the current passing 



through the spark gap. Therefore, for 
cp we get the equation 

dtp _ cp 

dt ~ rC ’ 


with cp = cpi at t — t A , whence we 
obtain 

<p = cp 1 e -(t-< ^ )/rC . (13.3.2) 

At time t = t B (point 5), cp = cp 2 , 
the spark gap again becomes an insu- 
lator, the charging process is initiated 
(section BC ), and so on. 

Let us determine the time $t_B - t_A$ during which the capacitor discharges. To do this, we take advantage of the fact that $\varphi = \varphi_2$ at $t = t_B$. Putting $\varphi = \varphi_2$ and $t = t_B$ in (13.3.2), we get

$\varphi_2 = \varphi_1 e^{-(t_B - t_A)/rC}$,

whence

$t_B - t_A = rC\ln(\varphi_1/\varphi_2)$.

Over segment BC (charging) the relation (13.3.1) displaced in time by the amount $\tau$ holds true (in Figure 13.3.2, $\tau$ is depicted by the line segment $A_0B$). For this reason

$\varphi = E(1 - e^{-(t - \tau)/RC})$.

Putting $t = t_B$, we get

$\varphi_2 = E(1 - e^{-(t_B - \tau)/RC})$.

Similarly, setting $t = t_C$, we find that

$\varphi_1 = E(1 - e^{-(t_C - \tau)/RC})$.

Combining the last two formulas, we get

$\frac{E - \varphi_2}{E - \varphi_1} = e^{(t_C - t_B)/RC}$, or $t_C - t_B = RC\ln\frac{E - \varphi_2}{E - \varphi_1}$.

The complete period (the charge-discharge cycle) is

$T = t_C - t_A = (t_C - t_B) + (t_B - t_A) = RC\ln\frac{E - \varphi_2}{E - \varphi_1} + rC\ln\frac{\varphi_1}{\varphi_2}$.
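The two contributions to the period can be evaluated for a concrete set of parameters. The sketch below is illustrative only: the function names and all numerical values are hypothetical, chosen so that $R$ is much greater than $r$ and $E > \varphi_1 > \varphi_2 > 0$.

```python
import math

def charge_time(E, phi1, phi2, R, C):
    """t_C - t_B: time to recharge from phi2 up to phi1 through R."""
    return R * C * math.log((E - phi2) / (E - phi1))

def discharge_time(phi1, phi2, r, C):
    """t_B - t_A: time to discharge from phi1 down to phi2 through r."""
    return r * C * math.log(phi1 / phi2)

def period(E, phi1, phi2, R, r, C):
    """Full charge-discharge period T; requires E > phi1 > phi2 > 0."""
    if not (E > phi1 > phi2 > 0):
        raise ValueError("need E > phi1 > phi2 > 0")
    return charge_time(E, phi1, phi2, R, C) + discharge_time(phi1, phi2, r, C)

# Hypothetical values, chosen only for illustration.
E, phi1, phi2 = 100.0, 60.0, 20.0   # volts
R, r, C = 1.0e6, 10.0, 1.0e-6       # ohms, ohms, farads

t_charge = charge_time(E, phi1, phi2, R, C)
t_discharge = discharge_time(phi1, phi2, r, C)
print(t_charge, t_discharge, period(E, phi1, phi2, R, r, C))
```

With these values the charging stage lasts on the order of a second while the discharge takes about ten microseconds, so the period is determined almost entirely by the charging time.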


Ordinarily, the resistance $R$ in the circuit of the voltage source is many times greater than that of the spark gap, and for this reason the charging time is much longer than the time of discharge. On the other hand, the discharge current is many times greater than the charging current, greater than the maximum current obtainable from the voltage source (with an internal resistance of $R$, the voltage source does not produce a current exceeding $E/R$). The circuit shown in Figure 13.3.1 transforms a long-term small current generated by the voltage source into a strong current, which, however, is of short duration (it is customary to speak of “short pulses” of current).

Figure 13.3.2

Figure 13.3.3

This circuit operates like a system in 
which a tiny flow of water gradually 
fills a vessel (Figure 13.3.3). The vessel 
is fixed in such a manner that when 
a sufficient amount of water has accu- 
mulated, the vessel turns over and 
the water pours out. The vessel then rights itself and the process begins anew. The vessel is mounted on a horizontal axis OO' below the midpoint. A weight is attached at the
bottom of the vessel so that the center 
of gravity of the empty vessel lies below 
the axis. But when the vessel fills up 
with water, the center of gravity of 
the full vessel lies above this axis and 
the vessel tips over. 

Let us return to the circuit diagrams in Figures 13.2.1 and 13.2.2. In these circuits, which consist of capacitances, resistances, and emf sources, the potentials even out in the course of time. Indeed, in Figure 13.2.1 the value $\varphi = 0$ sets in, and in Figure 13.2.2, $\varphi = E_0$ is the steady-state value (see formulas (13.2.4) and (13.2.7)).$^{13.8}$ The situation is quite different in the case of a spark-gap circuit. Here we have undamped oscillations of $\varphi$ (true, they are very different from those that we studied earlier). These oscillations are connected with certain specific properties of the spark gap, in particular with the fact that until a definite potential is reached (the breakdown potential $\varphi_1$), no current flows through the spark gap.

Many books have been written about 
the properties of discharge through air 
in a spark gap. All we have given here 
is a smattering of information — only 
enough to understand the operation of 
the circuit shown in Figure 13.3.1. 
This information does not even suffice 
to answer the simple question: What 
will happen if we connect the spark gap 
to a voltage source without a capacitor? 

Indeed, if no current flows, the voltage across the spark gap will be $E_0$. Since $E_0$ is greater than $\varphi_1$, breakdown should occur. But if this occurred, the resistance of the spark gap would become small, equal to $r$. Then a potential difference equal to $E_0 r/(r + R)$ would appear across the spark gap and the current would be $j = E_0/(r + R)$. If $R$ is great, the current $j$ is small, less than $j_2$, and the potential difference across the spark gap is small, less than $\varphi_2$. But then the air will not heat up and the resistance of the spark gap will not be small but great, which means the potential difference will be great and equal to $E_0$. We have a contradiction.

Actually, under these conditions we 
have an electric discharge of a different 
type, the glow discharge (small current 
without heating of the air), instead 
of the spark with incandescent air. 


13.8 Below we will see that the potential tends to a steady-state value as $t \to \infty$ in a circuit with an inductance, too.

13.4 The Energy of a Capacitor

A charged capacitor has a definite 
supply of energy, which can be given 
up very quickly if the capacitor is 
discharged through a small resistance. 

Let us find the supply of energy of a capacitor of capacitance $C$, one plate of which is grounded and the other has a potential $\varphi_0$. Then the quantity of electricity is $q_0 = C\varphi_0$.

It would appear at first glance that the energy is equal to the product $q_0\varphi_0$ ($= C\varphi_0^2$). In reality, this expression is not exact, though it is correct as to order of magnitude: it differs from the true value by a factor of 2.

Let us consider the charging of the capacitor. When the capacitor’s potential is $\varphi$ and the charge is $q$, the addition of a small quantity of electricity $dq$ increases the energy by

$dW = \varphi\,dq$. (13.4.1)

The essential thing is that during charging the potential $\varphi$ changes, since $\varphi = q/C$. Substituting this value of $\varphi$ into (13.4.1), we get

$dW = \frac{1}{C}\,q\,dq$. (13.4.2)

Integrating (13.4.2) from $q = 0$ (uncharged capacitor) to $q = q_0$, we get

$W(q_0) = \frac{1}{C}\int_0^{q_0} q\,dq = \frac{1}{C}\,\frac{q_0^2}{2} = \frac{q_0\varphi_0}{2} = \frac{1}{2}C\varphi_0^2$. (13.4.3)

Thus an exact evaluation yields the coefficient 1/2.
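The factor 1/2 can also be seen by adding up the increments $dW = \varphi\,dq$ numerically. In the sketch below, the function names and the values of $C$ and $\varphi_0$ are hypothetical; the sum uses the midpoint of each small charge increment.

```python
def capacitor_energy_numeric(C, q0, steps=10_000):
    """Sum dW = (q / C) * dq from q = 0 to q = q0, as in (13.4.2)."""
    dq = q0 / steps
    W = 0.0
    for i in range(steps):
        q_mid = (i + 0.5) * dq       # charge at the middle of the increment
        W += (q_mid / C) * dq        # potential q/C times the added charge
    return W

def capacitor_energy_formula(C, phi0):
    """Closed form (13.4.3): W = C * phi0**2 / 2."""
    return 0.5 * C * phi0 ** 2

C = 20e-6        # hypothetical capacitance, farads
phi0 = 100.0     # hypothetical final potential, volts
q0 = C * phi0    # final charge, coulombs

W_numeric = capacitor_energy_numeric(C, q0)
W_formula = capacitor_energy_formula(C, phi0)
print(W_numeric, W_formula, q0 * phi0)
```

Both computed energies are close to 0.1 J, half of the naive product $q_0\varphi_0 = 0.2$ J.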

Now let us examine the charging of 
a capacitor from a voltage source 
through a resistance (see Figure 13.2.2). 
The voltage source has a constant emf, 
E 0 . Therefore, when a quantity of 
electricity dq flows, the voltage source 
does work E 0 dq (this work is performed 
at the expense of the chemical energy 
of the voltage source, that is, the 
chemical energy diminishes). Hence, the 
total work done by the voltage source is equal to $E_0 q_0$, where $q_0$ is the total quantity of electricity that has flowed. When the capacitor is being charged, the process stops at $\varphi = E_0$. In this process, the voltage source will have performed work

$E_0 q_0 = E_0 \cdot CE_0 = CE_0^2$.

What supply of energy will the capacitor possess? This can readily be computed from formula (13.4.3):

$W = \frac{1}{2}CE_0^2$.

Where has half the work performed by the source gone? We will show that
it went to heat up the resistance $R$. Recall that if a quantity $dq$ of
electricity flows through a resistance, the energy released will be

$dA = \varphi_R\,dq,$ (13.4.4)

where $\varphi_R$ is the potential difference across the resistance.
Using the fact that $dq = j\,dt$ and $j = \varphi_R/R$, we can transform
(13.4.4) to the familiar form

$dA = \frac{\varphi_R^2}{R}\,dt = j^2 R\,dt.$

The quantity $j^2 R = \varphi_R^2/R$ is the amount of energy released on
the resistance in unit time, which is to say, it is the power output of
the resistance. It is measured, as expected, in watts:

$[j^2 R] = \mathrm{A}^2\cdot\Omega = \mathrm{A}^2\cdot(\mathrm{V/A}) = \mathrm{A}\cdot\mathrm{V} = \mathrm{A}\cdot(\mathrm{W/A}) = \mathrm{W}.$

The time dependence of $j$ in the case of the charging of a capacitor
through a resistance was found in Section 13.2:

$j(t) = \frac{E_0}{R}\,e^{-t/RC}.$

Therefore $dA = (E_0^2/R)\,e^{-2t/RC}\,dt$.

The energy released during time $T$ is

$A(T) = \frac{E_0^2}{R}\int_0^T e^{-2t/RC}\,dt,$

whence

$A(T) = -\frac{E_0^2}{R}\,\frac{RC}{2}\,e^{-2t/RC}\Big|_0^T = \frac{1}{2}\,CE_0^2\left(1 - e^{-2T/RC}\right).$ (13.4.5)

We know that as the time interval $T$ increases without limit, the
potential $\varphi$ approaches the value $E_0$. As can be seen from
(13.4.5), $A$ then approaches $CE_0^2/2$. Therefore, the total energy
released on the resistance is

$A = \frac{1}{2}\,CE_0^2.$ (13.4.6)

Thus, calculations confirm the fact 
that in the charging of a capacitor 
half the energy is lost on the resistance. 
The efficiency of charging is only 50%.
Note that if we directly connect the 
voltage source to the capacitor, nothing 
will change, the efficiency remains at 
50%, the role of resistance being taken 
by the internal resistance of the voltage 
source, which will then heat up. From 
formula (13.4.6) it is evident that the 
energy lost uselessly on the resistance 
in charging the capacitor is independent 
of the magnitude of R and hence does 
not depend on how fast the charging 
takes place. 
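The independence of the loss from $R$ can be illustrated numerically. The sketch below assumes illustrative values of $E_0$, $C$ and three resistances (none of them from the text) and integrates $j^2 R\,dt$ with $j(t) = (E_0/R)\,e^{-t/RC}$ over twenty time constants; the function name `heat_in_resistor` is introduced here only for the example.

```python
import math

def heat_in_resistor(E0, R, C, T, steps=200_000):
    """Integrate j(t)^2 * R dt from 0 to T, with j(t) = (E0/R) exp(-t/RC)."""
    dt = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                      # midpoint of the i-th interval
        j = (E0 / R) * math.exp(-t / (R * C))   # charging current
        total += j * j * R * dt
    return total

E0, C = 12.0, 1e-3               # volts, farads (illustrative assumptions)
expected = 0.5 * C * E0**2       # formula (13.4.6)

# Charge for 20 time constants so that the capacitor is practically full.
results = {R: heat_in_resistor(E0, R, C, T=20 * R * C)
           for R in (1.0, 100.0, 10_000.0)}

for R, A in results.items():
    print(R, A, expected)
```

Changing $R$ by four orders of magnitude changes how fast the heat is released, but the total should stay close to $CE_0^2/2$.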

Since $R$ did not appear in (13.4.6), this formula may be obtained
without introducing $R$ into the intermediate transformations. Indeed,
for the circuit of Figure 13.2.2, $\varphi_E + \varphi_R + \varphi_C = 0$,
whence $\varphi_R = -\varphi_E - \varphi_C = E_0 - q/C$. Therefore
$dA = (E_0 - q/C)\,dq$. Integrating this expression from $q = 0$ to
$q = q_0 = E_0 C$ yields

$A = \frac{1}{2}\,CE_0^2.$

The last derivation holds true also for the case where the resistance $R$
varies with time, while the previous derivation holds true only for
constant $R$, since only in this case could the formulas of Section 13.2
be applied.

In order to reduce losses in charging a capacitor, we should proceed as
follows: first take a voltage source with small emf $E_1$ and charge the
capacitor to the potential $E_1$, then disconnect the first voltage
source and connect a second source with greater emf $E_2$. Having charged
the capacitor to potential $E_2$, disconnect the second source and
connect a third with emf $E_3$, and so on. The gain is clearly seen in
the graph: lay off the charge $q$ of the capacitor on the axis of
abscissas and the potential $\varphi$ on the axis of ordinates. The two
are connected by the relation $\varphi = q/C$, which represents a
straight line (Figure 13.4.1). The energy of a capacitor is equal to the
area of the triangle OAB. The work done by the voltage source is equal to
the area of the rectangle OABD. The energy lost on the resistance is
equal to the area of the triangle ODB.

If the capacitor is charged in stages, the sum of the works performed by
all voltage sources is equal to the shaded area in Figure 13.4.2. We
leave it to the reader to find the efficiency for the case where the
charging process is divided into $n$ stages:

$E_1 = \frac{\varphi}{n},\quad E_2 = \frac{2\varphi}{n},\quad E_3 = \frac{3\varphi}{n},\quad \ldots,\quad E_n = \varphi.$
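One way to check an answer to this exercise is to add up the work of the sources stage by stage. The sketch below is only an illustration under the assumptions stated in the exercise: $n$ equal steps of emf, each source delivering the charge $C\varphi/n$ at its own emf. It works in units where $C = 1$ and $\varphi = 1$, and the function name is hypothetical.

```python
from fractions import Fraction

def staged_charging_efficiency(n):
    """Efficiency of charging to potential phi in n stages E_k = k*phi/n.

    Units are chosen so that C = 1 and phi = 1.
    """
    dq = Fraction(1, n)  # charge added during each stage: C*phi/n
    # Source k delivers the charge dq at emf k*phi/n.
    source_work = sum(Fraction(k, n) * dq for k in range(1, n + 1))
    stored = Fraction(1, 2)  # final capacitor energy C*phi^2/2
    return stored / source_work

for n in (1, 2, 10, 100):
    print(n, staged_charging_efficiency(n))
```

For $n = 1$ the sketch reproduces the 50% efficiency found above, and the efficiency approaches 1 as the number of stages grows.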


In the foregoing, one plate of the capacitor was grounded, that is, its
potential was $\varphi_1 = 0$. Here, the energy of the capacitor depends
on the potential of the other plate, $\varphi_2$, that is,
$W = \frac{1}{2}\,C\varphi_2^2$.

If neither plate is grounded, the energy of the capacitor depends on the
potential difference across the plates, $\varphi_C$, that is,
$W = \frac{1}{2}\,C\varphi_C^2$. Indeed, we know that the charge $q$ on
each plate of the capacitor depends on the potential difference, the
charges on the plates being equal in magnitude but opposite in sign:

$q_A = C\varphi_C,\quad q_B = -C\varphi_C = -q_A,\quad dq_A = -dq_B.$

When computing the variation of energy in the charging process, one has
to take into account the variation of charge on both plates. Let the
potential of plate A be $\varphi_1$, the potential of plate B be
$\varphi_2$, and $\varphi_1 - \varphi_2 = \varphi_C$. Then

$dW = \varphi_1\,dq_A + \varphi_2\,dq_B = \varphi_1\,dq_A - \varphi_2\,dq_A = (\varphi_1 - \varphi_2)\,dq_A = \varphi_C\,dq_A.$

But since $\varphi_C = q_A/C$, we can write

$dW = \frac{1}{C}\,q_A\,dq_A.$ (13.4.7)

Integrating (13.4.7) from 0 to $q_A$ yields

$W = \frac{q_A^2}{2C} = \frac{1}{2}\,C\varphi_C^2.$

Knowing the expression for the energy of a charged capacitor as a
function of the capacitance, we can find the mechanical forces acting
between the plates of the capacitor. Imagine the plates to be connected
mechanically with some kind of lever and suppose that the capacitance $C$
depends on the position of the lever. If the position of the lever is
characterized by the value of the $x$ coordinate, the capacitance is a
function of $x$, or $C = C(x)$. At a definite position $x_0$ of the lever
the capacitance is $C(x_0) = C_0$. If in this position the capacitor is
charged to the potential $\varphi_0$, the charge on the plates is
$q_0 = C_0\varphi_0$ and the energy of the capacitor is

$W_0 = \frac{C_0\varphi_0^2}{2} = \frac{q_0^2}{2C_0}.$

Let us disconnect the capacitor from the voltage source and move the
lever. The charge will then remain constant (the potential varies in
inverse proportion to the capacitance), and the energy will change:

$W(x) = \frac{q_0^2}{2C(x)}.$


The electric energy of a charged capacitor is similar to the elastic
energy of a spring: $W(x)$ increases if an external force applied to the
lever does work. Then the external force overcomes the forces with which
the plates of the capacitor act on the lever. Contrariwise, if $W(x)$
decreases, the lever is displaced and does work in opposition to the
external applied forces. We may conclude that the force with which the
plates act on the lever is

$F = -\frac{dW}{dx} = -\frac{d}{dx}\,\frac{q_0^2}{2C(x)} = \frac{q_0^2}{2[C(x)]^2}\,\frac{dC(x)}{dx} = \frac{\varphi^2(x)}{2}\,\frac{dC(x)}{dx}.$ (13.4.8)

The force is directed toward increasing capacitance. For instance, if
the capacitor consists of two identical parallel plates, the capacitance
is in inverse proportion to the distance between the plates. This means
the capacitance increases when the plates are brought closer together.
Quite true, because when a capacitor is charged, the charges on the
plates are of opposite sign and so the plates attract, with more force
if they are close together.
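As an illustration of (13.4.8), consider an idealized parallel-plate capacitor. The sketch below assumes the standard formula $C = \varepsilon_0 S/x$ for plate area $S$ and gap $x$ (edge effects neglected), which is not derived in this section, and illustrative numerical values. It differentiates the energy at constant charge numerically and compares the result with both forms of (13.4.8).

```python
EPS0 = 8.854e-12  # electric constant, F/m

def capacitance(x, area):
    """Ideal parallel-plate capacitance for gap x (assumed model)."""
    return EPS0 * area / x

def energy_at_fixed_charge(x, q, area):
    """W(x) = q^2 / (2 C(x)) for a disconnected capacitor."""
    return q * q / (2.0 * capacitance(x, area))

def force(x, q, area, h=1e-7):
    """F = -dW/dx at constant charge, by a central difference."""
    w_plus = energy_at_fixed_charge(x + h, q, area)
    w_minus = energy_at_fixed_charge(x - h, q, area)
    return -(w_plus - w_minus) / (2.0 * h)

area, x0, phi0 = 1e-2, 1e-3, 100.0     # m^2, m, V (illustrative assumptions)
q0 = capacitance(x0, area) * phi0

F_numeric = force(x0, q0, area)
F_charge_form = -q0**2 / (2.0 * EPS0 * area)             # q0^2/(2C^2) * dC/dx
F_potential_form = 0.5 * phi0**2 * (-EPS0 * area / x0**2)  # phi^2/2 * dC/dx

print(F_numeric, F_charge_form, F_potential_form)
```

The force comes out negative, that is, directed toward smaller $x$: the plates attract, in the direction of increasing capacitance.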

Formula (13.4.8) enables us to find 
the force in more complicated situ- 
ations, as, for example, in the case of a 
variable capacitor, in which one plate 
can move in and out between two fixed 
plates. 

It is important to note that we took the derivative $dW/dx$ for a given
constant charge $q$. However, it is not permissible, when seeking the
force by the formula $F = -dW/dx$, to take the derivative of
$W = C(x)\varphi^2/2$ assuming $\varphi$ constant and having regard
solely for the fact that $C$ depends on $x$. We would then obtain an
incorrect sign for the force. Indeed, if the capacitor is disconnected
from the voltage source, $\varphi$ is not constant: $\varphi = q/C$ and
$C = C(x)$. If the capacitor is connected to a voltage source, then
$\varphi$ remains constant as the capacitance varies. But then the charge
$q$ varies, which means that a current is flowing through the voltage
source, that is, the voltage source is doing work equal to $\varphi\,dq$
(as $C$ increases). Hence, when applying the law of conservation of
energy for constant $\varphi$ and variable capacitance, one has to take
into consideration not only the variation in the energy of the capacitor
and the work of the force but also the work done by the voltage source.
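The energy balance for the connected capacitor can be written out explicitly. The following is a short sketch, assuming the potential $\varphi$ is held constant by the source while the lever moves by $dx$; the symbols $dA_{\text{source}}$ and $dA_{\text{mech}}$ are introduced here only for illustration.

```latex
\begin{align*}
dq &= \varphi\,dC, \\
dA_{\text{source}} &= \varphi\,dq = \varphi^2\,dC, \\
dW &= d\!\left(\tfrac{1}{2}\,C\varphi^2\right) = \tfrac{1}{2}\,\varphi^2\,dC, \\
dA_{\text{mech}} &= F\,dx = dA_{\text{source}} - dW = \tfrac{1}{2}\,\varphi^2\,dC, \\
F &= \frac{\varphi^2}{2}\,\frac{dC}{dx}.
\end{align*}
```

Half of the work done by the source goes into the capacitor and half into mechanical work, so the force agrees with (13.4.8) in both magnitude and sign, whereas the naive expression $-dW/dx$ at constant $\varphi$ would give the opposite sign.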

13.5 Inductance Circuit 

Let us consider a circuit consisting of resistance $R$ and inductance $L$
(Figure 13.5.1). By formula (13.1.11),

$\varphi_R + \varphi_L = 0.$ (13.5.1)

Since $\varphi_R = Rj$ and $\varphi_L = L\,(dj/dt)$, using (13.5.1) we
find that

$Rj + L\,\frac{dj}{dt} = 0.$

Thus, the current in the circuit of Figure 13.5.1 satisfies the equation

$\frac{dj}{dt} = -\frac{R}{L}\,j.$ (13.5.2)

The solution to this equation is

$j(t) = j_0\,e^{-(R/L)t}.$ (13.5.3)

We see that the current in the circuit of Figure 13.5.1 falls off
exponentially. The current diminishes $e$-fold during a time interval
$T = L/R$.

Let us verify the dimensions of $L/R$. $L$ is measured in henrys, that
is, V·s/A, and $R$ in ohms, or V/A. Therefore,

$[L/R] = (\mathrm{V\cdot s/A})/(\mathrm{V/A}) = \mathrm{s},$

so that $L/R$ indeed has the dimensions of time. We will call $L/R$ the
decay time. In the circuit diagram shown in Figure 13.5.1, in which there
is no voltage source, the current tends to zero with the passage of time.
How to set up an initial current of $j_0$ will be discussed later on.

For the present, let us consider a circuit consisting of a source having
an emf $E_0$, of a resistance $R$, and of an inductance $L$
(Figure 13.5.2). From the condition

$\varphi_E + \varphi_R + \varphi_L = 0,$

recalling that $\varphi_E = -E_0$, we find that

$-E_0 + Rj + L\,\frac{dj}{dt} = 0.$ (13.5.4)

Figure 13.5.2

We rewrite this equation as

$\frac{dj}{dt} = \frac{R}{L}\left(\frac{E_0}{R} - j\right).$

This equation is similar to Eq. (13.2.5) and is solved by the very same
procedure. We get

$j(t) = \frac{E_0}{R} + A\,e^{-(R/L)t},$ (13.5.5)

where $A$ is determined by the initial condition. Suppose the switch is
closed at $t = 0$. Then $j(0) = 0$ because there was no current flowing
in the circuit when the switch was open. Given this condition, we find
that $A = -E_0/R$, and (13.5.5) assumes the form

$j = \frac{E_0}{R}\left(1 - e^{-(R/L)t}\right).$ (13.5.6)

Figure 13.5.1

In the course of time the current approaches the value

$j(\infty) = \frac{E_0}{R},$ (13.5.7)

which does not depend on the inductance $L$ and is simply obtained from
Ohm's law in a circuit with an emf $E_0$ and a resistance $R$. However,
this current sets in not at once but gradually (Figure 13.5.3), and the
time required for $j(\infty)$ to set in depends on $L$: in time $L/R$ the
current becomes $0.63\,j(\infty)$, in time $2(L/R)$ the current becomes
$0.86\,j(\infty)$, in time $3(L/R)$ the current becomes
$0.95\,j(\infty)$, and so forth. Thus, the ratio $L/R$ can also be called
the buildup time for the current.
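The quoted fractions follow directly from (13.5.6). A minimal sketch, assuming illustrative values of $L$ and $R$ (the fractions themselves do not depend on them):

```python
import math

def current_fraction(t, L, R):
    """j(t) / j(infinity) = 1 - exp(-R t / L), from formula (13.5.6)."""
    return 1.0 - math.exp(-R * t / L)

L, R = 0.02, 5.0   # henrys, ohms (illustrative assumptions)
tau = L / R        # buildup time

fractions = [round(current_fraction(k * tau, L, R), 2) for k in (1, 2, 3)]
print(fractions)   # [0.63, 0.86, 0.95]
```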

According to the basic equation (13.5.4), the sum of the difference of
potential across the resistance, $Rj$, and across the inductance,
$L\,(dj/dt)$, is equal to the emf $E_0$. It is interesting to follow each
term separately. They are shown in Figure 13.5.4. At the initial time,
$j = 0$, $Rj = 0$, and $E_0 = L\,(dj/dt)$. We say that the voltage is
absorbed entirely by the inductance. As time passes, the current
approaches the constant value $E_0/R$, $dj/dt$ tends to zero, and the
voltage is absorbed by the resistance.

It is interesting to compare the solutions that formula (13.5.6) yields
for the same $E_0$ but different values of $R$ and $L$. Let $R_1$ be
small, $R_2$ great, $L_1$ small, and $L_2$ great. For different
combinations of $R$ and $L$ we get four curves of current as functions of
time (see Figure 13.5.5). The final current $j(\infty)$ depends on $R$
alone; it is the same for the pairs $R_1, L_1$ and $R_1, L_2$;
$j(\infty)$ is also the same for the pairs $R_2, L_1$ and $R_2, L_2$. The
initial rate of buildup of current depends only on the inductance $L$ and
does not depend on the resistance: the curves specified by the pairs
$R_1, L_1$ and $R_2, L_1$ merge at the point O (0, 0), as do the curves
specified by the pairs $R_1, L_2$ and $R_2, L_2$.

On dimensional grounds, it is clear that the steady-state current is
proportional to the initial rate of current buildup and the time of
buildup. For our definition of buildup time, the formula is correct
without any additional coefficients. Indeed, the initial rate of current
buildup, $(dj/dt)_{t=0}$, is equal to $E_0/L$, and the buildup time, $T$,
is equal to $L/R$, whence the established current is

$j(\infty) = T\,(dj/dt)_{t=0} = \frac{L}{R}\cdot\frac{E_0}{L} = \frac{E_0}{R}.$

Now comes the question we posed at the beginning of our discussion: how
to set up the initial current $j_0$ in the circuit in Figure 13.5.1. We
can take the circuit diagram of Figure 13.5.6. We start by closing
switch A and leaving switch B open. Then a current will flow and soon
attain the value $E_0/(R + R_1)$ in accordance with formula (13.5.7). We
choose $E_0$ in such a manner that $E_0/(R + R_1) = j_0$. We wait for the
steady state to set in when the current is equal to $j_0$ with A closed
and B open. Then in this state we close B and open A. The result is the
circuit shown in Figure 13.5.1. At the initial instant of time (when
closing B), a current $j_0$ flows. The potential at point 1 (see
Figure 13.5.5), prior to closing the switch, is $\varphi_1 = 0$, since in
the steady state, when $j_0$ is constant, the voltage drop on $L$ is
zero. Prior to closing the switch, the potential at point 2 is equal to
$\varphi_2 = Rj_0$. When switch B is closed, point 2 becomes grounded and
so the potential at point 2 is $\varphi_2 = 0$. There is then a
corresponding readjustment of potentials at all other points of the
circuit. In particular, the potential at point 1 is now
$\varphi_1 = -Rj$.

Figure 13.5.4

Figure 13.5.6

13.6 Breaking an Inductance Circuit 

Above we considered the process of a steady-state current setting in in
the circuit shown in Figure 13.5.2, which consisted of a voltage source,
a resistance $R$, an inductance $L$, and a switch. Figure 13.5.3 shows
the curve of current buildup when the switch is closed at time $t = 0$.
In time, the current reaches the value $j_0 = E_0/R$. What will happen
now if we suddenly open the switch? If the current ceases to flow in a
very short time $\tau$, the derivative of the current with respect to
time can be represented as follows:

$\frac{dj}{dt} \approx \frac{j(t + \tau) - j(t)}{\tau} = -\frac{j_0}{\tau},$

which is to say that the derivative will be very great in absolute value
if $\tau$ is very small. And at point A there will appear a very large
(in absolute value) negative potential:

$\varphi_A = L\,\frac{dj}{dt} \approx -\frac{Lj_0}{\tau}.$

The potential difference across the resistance $R$ (it is equal to $Rj$)
and the emf of the voltage source change but slightly when the switch is
opened. For this reason, the great potential difference that appears
across the inductance $L$ when the switch is opened falls entirely on the
switch; by this we mean that the potential difference across the open
contacts of the switch becomes very great, of the order of $Lj_0/\tau$,
and can exceed many times over the emf of the voltage source, $E_0$. If
the potential difference is great, the air gap between the open contacts
will break down and a spark will jump across.

The problem of current variation in a circuit when a switch is opened
proves to be very complicated; this is due to the involved nature of the
laws of electric discharge in air between plates. Indeed, prior to
breakdown, when $\varphi < \varphi_b$, there was no current; but when
breakdown occurs, the resistance of the spark falls drastically, and a
big current flows at a potential difference considerably less than
$\varphi_b$. Here we only note the basic fact: large potential
differences arise in inductance circuits in circuit breaking. When such a
circuit is closed, the potential difference does not exceed $E_0$ (the
emf of the source) anywhere.

The two circuits shown in Figure 13.6.1 give a quantitative idea of the
phenomenon of sudden brief increases in potential difference. These
circuits differ from that of Figure 13.5.2 in that current can also flow
in inductance $L$ when switch B is open, so that circuit breaking occurs
without any spark. However, if the resistance $R$ is much greater than
the resistance $r$, a large potential difference appears across the
inductance at break.

As an example, let us consider the circuit shown in Figure 13.6.1a. We
assume $R \gg r$. If the switch is closed, at an arbitrary time the
current in the left portion of the circuit ($r$, $E$) is equal to the sum
of the currents in the parallel connection $RL$:

$j_r = j_E = j_R + j_L.$

It is then always true that $\varphi_L = \varphi_R$.

Let the switch be closed at time $t = 0$. At this instant all the current
will go through the resistance $R$, so that
$j_{r0} = j_{R0} = E_0/(R + r)$, by Ohm's law. Then

$\varphi_{r0} = E_0\,\frac{r}{R + r},\quad \varphi_{R0} = E_0\,\frac{R}{R + r},$

hence

$\left.\frac{dj_L}{dt}\right|_{t=0} = \frac{\varphi_{R0}}{L} = \frac{E_0}{L}\,\frac{R}{R + r}.$

After a sufficient time has elapsed since the circuit was made, a
constant current will flow. In the steady state, the entire current will
go through the inductance. Indeed, if the current $j$ does not vary with
time, then $dj/dt = 0$, therefore $\varphi_L = 0$, and so
$\varphi_R = 0$, whence $j_R = 0$.

In the steady state, $\varphi_{r\infty} = E_0$ and
$j_{r\infty} = j_{L\infty} = E_0/r$, whence it is easy to obtain the
order of the time $\tau_1$ during which the steady-state current sets in:

$j_{L\infty} - j_{L0} \approx \left.\frac{dj_L}{dt}\right|_{t=0}\tau_1, \quad\text{or}\quad \frac{E_0}{r} \approx \frac{E_0}{L}\,\frac{R}{R + r}\,\tau_1,$

whence

$\tau_1 \approx \frac{L(R + r)}{rR} \approx \frac{L}{r}.$


Now let us examine breaking the circuit after a time lapse of
$t \gg \tau_1$ after the circuit is closed, that is to say, after a
constant current $j_\infty = E_0/r$ has been set up in the circuit. When
the circuit is broken, $j_r = j_E = 0$ and $j_R + j_L = 0$, whence
$j_R = -j_L$. This means that the entire current passing through $L$ must
pass through $R$ in the reverse direction. As before, of course,
$\varphi_L = \varphi_R$. Therefore, $\varphi_R = Rj_R = -Rj_L$, or
$\varphi_L = -Rj_L$. But since $\varphi_L = L\,(dj_L/dt)$, it follows
that $L\,(dj_L/dt) = -Rj_L$.

We have thus arrived at Eq. (13.5.2), 
which is quite natural since the right- 
hand part of the diagram in Figure 
13.6.1a (after the circuit is broken) 
does not differ from the circuit diagram 
in Figure 13.5.1. 

The current diminishes by a factor of $e$ during time $\tau_2 = L/R$.
Here, $\tau_2 \ll \tau_1$, since $R \gg r$. At the time of break, the
current has the value $j_{L\infty} = E_0/r$. After the break has taken
place, but prior to the current falling off perceptibly, that is, for a
time $t < \tau_2$ after the break, we get

$\varphi_R = \varphi_L \approx -Rj_{L\infty} = -E_0\,\frac{R}{r}.$


Thus, at break we can obtain a po- 
tential difference that is many times 
greater than the emf of the voltage 
source. This fact is extensively employed 
in engineering, in particular in the ig- 
nition systems of internal combustion 
engines. Observe that this large poten- 
tial difference occurs over an extremely 
small time interval. 
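To get a feeling for the magnitudes, the estimates above can be evaluated for sample component values. The sketch below assumes illustrative values of $E_0$, $r$, $R$ and $L$ (they are not taken from the text) and uses the rough formulas $\varphi_L \approx -E_0 R/r$, $\tau_1 \approx L(R + r)/(rR)$ and $\tau_2 = L/R$.

```python
def break_voltage(E0, r, R):
    """Potential across the inductance just after the break: -R * (E0 / r)."""
    return -R * (E0 / r)

E0, r, R, L = 12.0, 1.0, 1000.0, 0.01  # V, ohm, ohm, H (illustrative assumptions)

phi_break = break_voltage(E0, r, R)    # -12000.0 V, a thousand times the emf
tau1 = L * (R + r) / (r * R)           # time for the current to build up
tau2 = L / R                           # time for which the high voltage lasts

print(phi_break, tau1, tau2)
```

With these values the voltage at break exceeds the emf a thousandfold, but it persists for a time about a thousand times shorter than the buildup time.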

The foregoing is a rough consideration of the problem without the use of
derivatives and other tools of higher mathematics. An exact consideration
of the problem of closing a switch in the circuit of Figure 13.6.1a
yields the following. Proceeding from the relations

$\varphi_E + \varphi_r + \varphi_L = 0,\quad j = j_R + j_L,\quad \varphi_R = \varphi_L,$




Figure 13.6.2 

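we get the differential equation

$\frac{dj_L}{dt} + \frac{rR}{(R + r)L}\,j_L = \frac{E_0 R}{(R + r)L}.$

At the initial time, $t = 0$, the current flowing through the inductance
is zero: $j_L = 0$ at $t = 0$. Therefore,

$j_L = \frac{E_0}{r}\left\{1 - \exp\left[-\frac{rR}{(R + r)L}\,t\right]\right\}.$ (13.6.1)

The current in the circuit is

$j = j_L + j_R = \frac{E_0}{r}\left\{1 - \exp\left[-\frac{rR}{(R + r)L}\,t\right]\right\} + \frac{E_0}{R + r}\exp\left[-\frac{rR}{(R + r)L}\,t\right].$ (13.6.2)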

In Figure 13.6.2 the approximate solu- 
tion is shown by the broken line and the 
exact solution (13.6.2) by the smooth 
curve. 

We advise the reader to examine the 
process of variation of current and po- 
tential difference at make and break 
in the circuit of Figure 13.6.1b. It is
useful to solve the problem twice: once 
by setting up the differential equation 
and seeking its solution in the form of 
an exponential function, and the second 
time in the approximate fashion, as we 
did for the circuit depicted in Fig- 
ure 13.6.1a. 

13.7 The Energy of Inductance 

We have seen that in a circuit consisting only of an inductance $L$ and a
resistance $R$, current continues to flow after the voltage source has
been disconnected. The current gradually falls off in time. In the
process, a quantity of heat $Q = Rj^2$ is released on the resistance in
unit time.

What is the source of the electric 
energy that is converted into heat in 
the resistance? The energy is given up 
by the inductance, which has a certain 
supply of energy. 

Let us find this supply of energy by considering the elementary circuit
diagram shown in Figure 13.5.1 and let us calculate the entire thermal
energy released on $R$. Suppose at the initial time, $t = 0$, this
circuit has a current $j_0$. The current will then decay in time in
accordance with the law

$j(t) = j_0\,e^{-Rt/L}.$

The quantity of energy released on the resistance $R$ in unit time, that
is, the rate of energy release (dimensions: J/s), is the instantaneous
power output $h$. Using the time dependence of $j$ given above, we find
that

$h = Rj^2 = Rj_0^2\,e^{-2Rt/L}.$ (13.7.1)

Knowing $h$, it is easy to find the total amount of heat released from
time $t = 0$ to infinity ($t = \infty$), that is, to complete decay of
the current. To do this, it suffices to integrate (13.7.1) from $t = 0$
to $t = \infty$. This yields

$Q = \int_0^\infty Rj_0^2\,e^{-2Rt/L}\,dt = Rj_0^2\int_0^\infty e^{-2Rt/L}\,dt = Rj_0^2\,\frac{L}{2R} = \frac{Lj_0^2}{2}.$ (13.7.2)


This heat is equal to the supply of energy of the inductance through
which the current $j_0$ flows. This supply of energy does not depend on
the magnitude of the resistance $R$. An inductance $L$ with current $j_0$
has a definite supply of energy which, ultimately, is transformed
completely into heat, irrespective of the magnitude of the resistance
$R$, which only affects the rate of transformation of energy into heat
but not the total amount of energy.
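The same kind of numerical check that works for the capacitor applies here. The sketch below assumes illustrative values of $L$ and $j_0$ and three resistances (none from the text), integrates the power (13.7.1) over twenty decay times, and compares the heat with $Lj_0^2/2$; the function name is hypothetical.

```python
import math

def heat_released(L, R, j0, n_tau=20, steps=200_000):
    """Integrate h(t) = R * j0^2 * exp(-2 R t / L) from 0 to n_tau * L / R."""
    dt = n_tau * (L / R) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of the i-th interval
        total += R * j0**2 * math.exp(-2.0 * R * t / L) * dt
    return total

L, j0 = 0.02, 1.0          # henrys, amperes (illustrative assumptions)
stored = 0.5 * L * j0**2   # formula (13.7.2)

heats = {R: heat_released(L, R, j0) for R in (0.1, 1.0, 10.0)}
for R, Q in heats.items():
    print(R, Q, stored)
```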

Formula (13.7.2) can also be obtained by considering the process of
current buildup in an inductance. Indeed, the power of the current (that
is, work per unit time) is equal to $\varphi j$. This work is done by
external sources of voltage and goes to increase the inductance energy
$W$:

$\frac{dW}{dt} = \varphi j.$ (13.7.3)

Using the fact that $\varphi = L\,(dj/dt)$, we find, from (13.7.3), that

$\frac{dW}{dt} = Lj\,\frac{dj}{dt} = \frac{1}{2}\,L\,\frac{d(j^2)}{dt}.$ (13.7.4)

We will assume that $j = 0$ and $W = 0$ at $t = 0$, and $j = j_0$ and
$W = W_0$ at $t = t_0$. Then, integrating (13.7.4) from $t = 0$ to
$t = t_0$, we obtain

$W_0 = \frac{Lj_0^2}{2}.$

Figure 13.7.1

To be specific, imagine the circuit in Figure 13.5.2
($\varphi = \varphi_A$) and carry out the detailed computations for
buildup of the energy of inductance. In the steady-state mode, when the
current has reached a constant value $j_0$, the potential $\varphi_A$ is
zero and the energy of inductance does not vary, but the source of emf
needed to maintain the constant current $j_0$ must continue to generate
energy, which is released as heat on resistance $R$.

The energy W of inductance is pro- 
portional to the square of the current, 
which is to say, it is proportional to the 
square of the rate of motion of the elec- 
trons. Therefore, externally, W re- 
sembles kinetic energy. But is W the 
kinetic energy of the electrons? 

Let us consider the orders of magnitude of $W$ and the electron energy.
Using a copper wire of length 100 m and diameter 0.35 mm
(cross-sectional area equal to approximately $10^{-7}$ m$^2$), we can
wind a coil having an inductance of 0.02 henry. A current of 1 A flowing
in this coil corresponds to an energy
$W = 0.02 \times 1^2 \times 0.5 = 10^{-2}$ J. We will now find the
kinetic energy of the electrons.

We will assume that for each atom of copper there is one electron
carrying current (the conduction electron). The atomic weight of copper
is about 63, so 63 grams of copper contain $6 \times 10^{23}$ conduction
electrons, or roughly $10^{22}$ electrons per gram, or $10^{25}$
electrons per kilogram. Copper has a density of about
8 g/cm$^3$ = $8 \times 10^3$ kg/m$^3$, and so one cubic meter of copper
wire contains roughly $n = 8 \times 10^{28}$ conduction electrons.
Imagine a piece of copper wire of length $v\,dt$ and a cross-sectional
area $S$ to the left of cross section O (Figure 13.7.1). If the electron
velocity is $v$, then through $S$ there will flow $Snv\,dt$ electrons in
time $dt$.$^{13.9}$ In time $dt$ the electrons at cross section A will
move to cross section O, which means that during this time all the
electrons in the volume between O and A will pass through O. That is,
they will pass through a cylinder of altitude $v\,dt$ and base $S$.

Denote by $e$ the electron charge in coulombs,
$e = -1.6 \times 10^{-19}$ C. The quantity of electricity which the
$Snv\,dt$ electrons transfer in time $dt$ is equal to the current in
amperes multiplied by the time $dt$. Therefore, $Snve\,dt = j\,dt$,
whence $j = Snve$, or $v = j/Sne$. Substituting $j = 1$ A,
$S = 10^{-7}$ m$^2$, $n = 8 \times 10^{28}$ m$^{-3}$,
$e = -1.6 \times 10^{-19}$ C, we obtain (in absolute value)

$v = \frac{1}{10^{-7} \times 8 \times 10^{28} \times 1.6 \times 10^{-19}} \approx 8 \times 10^{-4}.$

The dimensions are that of velocity:
$[v] = \mathrm{A}/(\mathrm{m}^2\cdot\mathrm{m}^{-3}\cdot\mathrm{C}) = \mathrm{A\cdot m}/(\mathrm{A\cdot s}) = \mathrm{m/s}.$

Now we have to calculate the kinetic energy of the (conduction)
electrons. The electron mass $m$ is $9 \times 10^{-31}$ kg. The total
number of electrons moving in the wire ($S = 10^{-7}$ m$^2$ and
$l = 10^2$ m) is
$N = 10^2 \times 10^{-7} \times 8 \times 10^{28} \approx 10^{24}$. The
kinetic energy of one electron is

$\frac{mv^2}{2} \approx \frac{1}{2} \times 9 \times 10^{-31} \times 64 \times 10^{-8} \approx 3 \times 10^{-37}$ J,

while the total kinetic energy of the electrons moving in the wire is

$T = N\,\frac{mv^2}{2} \approx 10^{24} \times 3 \times 10^{-37} = 3 \times 10^{-13}$ J.

$^{13.9}$ We have in view the mean velocity of their motion in the
direction of current flow and not the velocity of random thermal motion.

Thus, the kinetic energy of the electrons constitutes a minute fraction
of the inductance energy, although it depends on the current via the
same law (it is proportional to $j^2$) as the inductance energy.
Physically, the inductance energy is the energy of the magnetic field
which appears in the coil when current flows through it.
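The order-of-magnitude comparison can be reproduced in a few lines. The sketch below uses the rounded data quoted in the text (wire length and cross section, electron density, charge and mass, coil inductance and current); the variable names are chosen here for illustration. The text rounds the number of electrons up to $10^{24}$, so the figures printed here are slightly smaller than those quoted above but of the same order.

```python
n_density = 8e28   # conduction electrons per cubic meter of copper
S = 1e-7           # cross-sectional area of the wire, m^2
length = 100.0     # length of the wire, m
j = 1.0            # current, A
e = 1.6e-19        # magnitude of the electron charge, C
m = 9e-31          # electron mass, kg
L_coil = 0.02      # inductance of the coil, H

v = j / (S * n_density * e)        # mean drift velocity, m/s
N = length * S * n_density         # number of conduction electrons in the wire
ke_one = 0.5 * m * v**2            # kinetic energy of one electron, J
ke_total = N * ke_one              # kinetic energy of all electrons, J
W = 0.5 * L_coil * j**2            # energy of the inductance, J

print(v, ke_one, ke_total, W, ke_total / W)
```

The ratio in the last column is of the order of $10^{-11}$, which is why the inductance energy cannot be the kinetic energy of the electrons.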

Let us point out some similarities and 
differences between capacitance and in- 
ductance. Both can serve as reservoirs of 
energy, both can be used to accumulate 
electric energy from a weak primary 
source and then release it quickly at 
the required place and time. 

A capacitor can be charged with a small current $j_1$ during a long time
$t_1$; then, rapidly discharging it through a small resistance during a
short time $t_2$, we can obtain a large current
$j_2 \sim j_1 t_1/t_2$, and the potential difference across the
capacitor does not exceed the emf of the primary source. In other words,
a capacitor enables us to increase the current but not the voltage.

We can send a large current through an inductance under a small voltage
(small emf) $E_0$ of the primary source. The only requirement here is
that the resistances of the inductance and the primary source be small.
Then it takes a comparatively long time $t_3$ for a large current to
build up in the inductance. When the inductance is shunted by a high
resistance, we can obtain a large potential difference $\varphi$ for a
short time $t_4$, with $\varphi \sim E_0 t_3/t_4$. An inductance enables
us to increase the voltage but not the current.


The essential practical difference between a capacitance and an
inductance is that a capacitor disconnected from a voltage source can
retain its supply of energy for a very long time, hours or even days.
The discharge time of a capacitor is equal to $RC$, where $C$ is the
capacitance and $R$ the leakage resistance. Using good insulators, we
can obtain enormous values of $R$, that is, long discharge times. An
inductance in the form of a coil and short-circuited (minimal
resistance) retains its energy (if current is flowing) for only a
fraction of a second.

The decay time of the current in an inductance is of the order of
$L/R$, but even with the best conductors (copper, silver) it is
impossible to make $L/R$ greater than a few seconds for an ordinary
laboratory-type coil. It will be noted that if we increase the number of
turns in the coil for a given volume by using thinner wire, $L$ will
increase, but so will $R$; their ratio, however, to within an order of
magnitude, does not change. Therefore, under laboratory conditions,
inductance is conveniently used for increasing voltage but not for
long-term storage of energy.

Circuits involving capacitances and inductances can be used to
accumulate energy from, say, a flashlight battery, which, with an
internal resistance of several ohms, yields a few volts, so that the
maximum power output is of the order of one to two watts. Using circuits
of this kind, we are able to obtain powers up to hundreds of kilowatts.
But a power output of this kind lasts for a time interval of the order
of $10^{-6}$ s.

It has been noted that the electric 
energy in an inductance is quickly con- 
verted into heat due to resistance. This 
assertion holds true for coils of the 
ordinary laboratory kind and at ordi- 
nary (normal) temperatures. In two 
extreme cases, however, this does not 
hold true. 

(1) At very low temperatures of the order of −260 °C down to absolute
zero (−273 °C), many metals (for instance, lead, mercury, but not
copper) pass into what is known as the superconducting state. Their
specific resistance (resistivity) becomes exactly equal to zero.

The Dutch scientist Heike Kamer- 
lingh Onnes (1853-1926), who discov- 
ered this phenomenon in 1911, observed 
a constant current in a ring circuit of 
superconducting material that lasted 
many days without any decrease in in- 
tensity. The presence of current in such 
a ring circuit is detected via the magnet- 
ic field of the current. 

The practical application of superconductors is limited not only by the
difficulty of generating low temperatures. A strong magnetic field
converts a superconductor to the normal state (with finite resistance).
This is why large currents cannot be transmitted through a
superconductor.$^{13.10}$

(2) The relationship between inductance and resistance and the
conditions of current decay vary drastically when all the dimensions of
the coil are increased, particularly when passing to astronomical
phenomena (on the astronomical scale).$^{13.11}$

Picture two geometrically similar coils, one of which is $n$ times the
other in size, the number of turns in both being the same. In the large
coil, the diameter of the coil is $n$ times greater, but so is the
height of the coil and the diameter of the wire. Suppose the coils are
made of the same material. Quantities relating to the small coil will be
labeled with the subscript 1, those referring to the large coil will
have the subscript 2. Let us calculate the relation between the
resistances of the coils:

$R_1 = \rho l_1/S_1,\quad R_2 = \rho l_2/S_2,$

where $\rho$ is the resistivity (specific resistance) of the coil
material, $l$ is the length of the wire, and $S$ the cross-sectional
area of the wire. Geometrically, it is clear that $l_2 = nl_1$ and
$S_2 = n^2 S_1$ and, hence, that $R_2 = R_1/n$. The resistance is
inversely proportional to $n$, that is, to the dimensions.

$^{13.10}$ In 1961 an alloy of the rare element niobium and tin was
discovered in which a current up to 100 000 A/cm$^2$ and a magnetic
field up to 25 T (that is, 250 000 gauss) were not able to destroy
superconductivity. At present superconducting magnets are widely used in
science and technology, for instance, in particle accelerators. There is
reason to believe that in the near future superconductivity will be
employed in electric transmission lines.

$^{13.11}$ Compare with Exercise 13.2.3; for a circuit consisting of a
capacitance and a resistance the discharge time does not change when all
dimensions are altered.

It can be proved that the inductance of the large coil is exactly $n$
times the inductance of the small coil, $L_2 = nL_1$, which means that
increasing the linear dimensions of the coil $n$ times increases the
inductance $n$ times, too. The decay time of the current, $\tau$, is of
the order of $L/R$; consequently, $\tau_1 = L_1/R_1$ and
$\tau_2 = L_2/R_2 = n^2 L_1/R_1 = n^2\tau_1$.

Thus, the decay time of the current 
is proportional to the square of the di- 
mensions. If the earth were to consist 
of copper, the decay time of a current 
flowing in it would be of the order of 
10 15 to 10 16 s, or approximately 10 8 
years. 
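The scaling argument can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `scale_coil` and the numerical values for the small coil are assumptions chosen for illustration and are not taken from the text.

```python
def scale_coil(R1, L1, n):
    """Scale the linear dimensions of a coil n times (same material, same turns).

    Uses the relations of the text: R2 = R1 / n, L2 = n * L1, tau = L / R.
    Returns the tuple (R2, L2, tau2).
    """
    R2 = R1 / n
    L2 = n * L1
    tau2 = L2 / R2
    return R2, L2, tau2


# Assumed illustrative values for the small coil: 2 ohms and 0.5 henry.
R1, L1 = 2.0, 0.5
tau1 = L1 / R1
R2, L2, tau2 = scale_coil(R1, L1, n=10)

# The ratio of decay times should equal n squared.
ratio = tau2 / tau1
```

With $n = 10$ the ratio `tau2 / tau1` comes out as $n^2 = 100$, in agreement with $\tau_2 = n^2\tau_1$.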

The conductivity of ionized gases is of the same order of magnitude as that of copper. For this reason, the decay time of a current in astronomical phenomena is enormous. This means that resistance and Ohm's law do not play any role whatsoever in these phenomena. Recall Figure 13.5.5: the current on the initial portion of any curve depends on $L$ alone but not on $R$, and in astronomy we are always on the "initial portion."

Terrestrial magnetism is due to the 
magnetic field of the current flowing 
in the viscous molten mass of the cen- 
tral core of the earth. The slow motions 
of this molten mass in the magnetic 
field sustain the currents, just as the 
motion of the armature in a dynamo in 
a magnetic field sustains the current 
in the armature winding and in the 
winding of the electromagnet. The 
same is true of the magnetic field of the 
sun. The theory successfully predicts the periodic changes in the direction of the sun's field (the half-period is 11 years).


13.8 The Oscillatory Circuit 

Let us consider a circuit consisting of a capacitance $C$ and an inductance $L$ (Figure 13.8.1). Let point B of the circuit be grounded. By formula (13.1.11), $\varphi_C + \varphi_L = 0$, where $\varphi_L = L\,(dj/dt)$ and $\varphi_C = q/C$. The voltage drop on the capacitance, $\varphi_C$, will be denoted simply $\varphi$. Then

$$\varphi + L\,\frac{dj}{dt} = 0. \qquad (13.8.1)$$

Note that $j = dq/dt$ and so $dj/dt = d^2q/dt^2$. But since $d^2q/dt^2 = C\,(d^2\varphi/dt^2)$, using formula (13.8.1) we find that

$$\varphi + LC\,\frac{d^2\varphi}{dt^2} = 0, \quad\text{or}\quad LC\,\frac{d^2\varphi}{dt^2} = -\varphi. \qquad (13.8.2)$$


We considered a similar equation in Chapter 10 in the study of mechanical vibrations. It was established that the functions $\varphi = A\sin\omega t$ and $\varphi = B\cos\omega t$ are solutions to Eq. (13.8.2) for arbitrary $A$ and $B$ and a suitably chosen $\omega$. Let us verify this, say, for $\varphi = A\sin\omega t$ and in passing we will define $\omega$. Substituting $\varphi$ and $d^2\varphi/dt^2$ into (13.8.2), we get

$$-ALC\omega^2\sin\omega t = -A\sin\omega t,$$

or, canceling out $-A\sin\omega t$, we get $LC\omega^2 = 1$, whence

$$\omega = \frac{1}{\sqrt{LC}}. \qquad (13.8.3)$$

Consequently, for a solution to Eq. (13.8.2) we have functions that describe oscillations with a circular frequency $1/\sqrt{LC}$. The oscillation period is

$$T = \frac{2\pi}{\omega} = 2\pi\sqrt{LC}. \qquad (13.8.4)$$

Let us check the dimensions in (13.8.4):

$$[C] = \mathrm{F} = \mathrm{C/V} = \mathrm{A\cdot s/V},$$

$$[L] = \mathrm{H} = \mathrm{V/(A/s)} = \mathrm{V\cdot s/A},$$

so that $[LC] = \mathrm{s}^2$ and $T \propto \sqrt{LC}$ does indeed have the dimensions of time, s.
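Formulas (13.8.3) and (13.8.4) translate directly into a short computation. This is only a sketch; the function names and the component values (1 mH and 1 µF) are assumptions made for the example, not values from the text.

```python
import math


def natural_frequency(L, C):
    """Circular frequency omega = 1 / sqrt(L*C), formula (13.8.3)."""
    return 1.0 / math.sqrt(L * C)


def period(L, C):
    """Oscillation period T = 2*pi*sqrt(L*C), formula (13.8.4)."""
    return 2.0 * math.pi * math.sqrt(L * C)


# Assumed illustrative values: L = 1 mH (henrys), C = 1 uF (farads).
L_example, C_example = 1e-3, 1e-6
omega = natural_frequency(L_example, C_example)  # in reciprocal seconds
T = period(L_example, C_example)                 # in seconds
```

For any choice of $L$ and $C$ the product `omega * T` equals $2\pi$, which is a quick consistency check of the two formulas.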

Let us examine in detail the solution to Eq. (13.8.2). The solutions $\varphi = A\sin\omega t$ and $\varphi = B\cos\omega t$ are actually indistinguishable since the sine curve is obtained from the cosine curve by a shift along the $t$ axis; the two can be combined into a single expression:

$$\varphi = B\cos(\omega t + \alpha) \qquad (13.8.5)$$

(cf. Section 10.2). The amplitude $B$ of the oscillations and the initial phase $\alpha$ (in Section 10.2 we denoted this quantity by $\varphi$) may be arbitrary. For a given $\varphi(t)$ we find the dependence of current on time:

$$j = C\,\frac{d\varphi}{dt} = -CB\omega\sin(\omega t + \alpha). \qquad (13.8.6)$$

Let us find the energy of capacitance and the energy of inductance:

$$W_C = \frac{C\varphi^2}{2} = \frac{CB^2}{2}\cos^2(\omega t + \alpha),$$

$$W_L = \frac{Lj^2}{2} = \frac{LC^2B^2\omega^2}{2}\sin^2(\omega t + \alpha).$$

Substituting the expression (13.8.3) for $\omega$ yields

$$W_L = \frac{CB^2}{2}\sin^2(\omega t + \alpha).$$

The total energy is independent of time, as was to be expected. Indeed,

$$P = W_C + W_L = \frac{CB^2}{2}\left[\cos^2(\omega t + \alpha) + \sin^2(\omega t + \alpha)\right] = \frac{CB^2}{2}.$$
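The conservation of the total energy can also be confirmed numerically. The sketch below evaluates $W_C$ and $W_L$ from (13.8.5) and (13.8.6) at an arbitrary time; the function names and the sample parameter values are assumptions introduced for illustration.

```python
import math


def energies(t, B, alpha, L, C):
    """Return (W_C, W_L) at time t for phi = B*cos(omega*t + alpha)."""
    omega = 1.0 / math.sqrt(L * C)                      # formula (13.8.3)
    phi = B * math.cos(omega * t + alpha)               # formula (13.8.5)
    j = -C * B * omega * math.sin(omega * t + alpha)    # formula (13.8.6)
    w_c = C * phi ** 2 / 2.0    # energy of the capacitance
    w_l = L * j ** 2 / 2.0      # energy of the inductance
    return w_c, w_l


def total_energy(t, B, alpha, L, C):
    """Total energy P = W_C + W_L; it should not depend on t."""
    w_c, w_l = energies(t, B, alpha, L, C)
    return w_c + w_l


# Assumed sample values: B = 2, alpha = 0.4, L = 0.5, C = 3.
P_at_start = total_energy(0.0, 2.0, 0.4, 0.5, 3.0)
P_later = total_energy(1.7, 2.0, 0.4, 0.5, 3.0)
```

Both values agree with $CB^2/2$ to within rounding error, whatever instant is chosen.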

To summarize, the motion of charges in an LC circuit is similar to the motion of a mass attached to a spring. The energy of a charged capacitor may be likened to the elastic energy of a spring, which is at a maximum when the mass is in the extreme position of maximum separation from the equilibrium position. The energy of an inductance may be likened to the kinetic energy of a moving mass. When the charge on a capacitance is equal to zero, the current reaches its maximum (in absolute value); at this instant the capacitance energy is zero and the inductance energy is equal to the total energy ($\cos^2(\omega t + \alpha) = 0$ and $\sin^2(\omega t + \alpha) = 1$). This is exactly what happens in the oscillations of a mass on a spring: when the mass passes through the position of equilibrium, the potential energy is zero and the kinetic energy is equal to the total energy of the oscillations.

Let us use the term general problem for the problem of finding the potential in a circuit provided that at the initial time $t = 0$ we have $j = j_0$ and $\varphi = \varphi_0$. Neither the particular solution $\varphi = A\sin\omega t$ nor the particular solution $\varphi = B\cos\omega t$ enables us to solve the general problem. To solve the general problem we will need the general solution (13.8.5), with the two (indeterminate) parameters $B$ and $\alpha$ at our disposal.

Setting $t = 0$ in (13.8.5) and (13.8.6) yields

$$\varphi(0) = B\cos\alpha = \varphi_0,$$

$$j(0) = -CB\omega\sin\alpha = j_0,$$

whence we find the solution with given $\varphi_0$ and $j_0$:

$$B = \sqrt{\varphi_0^2 + j_0^2/(C\omega)^2}, \qquad \tan\alpha = -\frac{j_0}{C\omega\varphi_0}. \qquad (13.8.7)$$

We already know that for such oscillations the energy is conserved and, hence, is equal to the initial energy $C\varphi_0^2/2 + Lj_0^2/2$. The expression for the energy may also be written as $CB^2/2$, which enables writing the first formula in (13.8.7) as

$$B = \varphi_{\max} = \sqrt{\varphi_0^2 + (L/C)\,j_0^2}, \qquad (13.8.8)$$



Figure 13.8.2 

whence (as well as from (13.8.6)) we also obtain

$$j_{\max} = \sqrt{j_0^2 + (C/L)\,\varphi_0^2} \qquad (13.8.9)$$

(note that $\omega = \sqrt{1/LC}$, in view of (13.8.4)).
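The passage from the initial values $\varphi_0$, $j_0$ to the parameters $B$ and $\alpha$ in (13.8.7) can be sketched as follows. The function name `amplitude_and_phase` and the sample values are assumptions for illustration. The two-argument arctangent is used instead of $\tan\alpha$ alone so that the quadrant of $\alpha$ is fixed by the signs of $\cos\alpha$ and $\sin\alpha$; this is a choice made here, not something stated in the text.

```python
import math


def amplitude_and_phase(phi0, j0, L, C):
    """Solve B*cos(alpha) = phi0 and -C*B*omega*sin(alpha) = j0 for B, alpha."""
    omega = 1.0 / math.sqrt(L * C)
    # B*cos(alpha) = phi0 and B*sin(alpha) = -j0 / (C*omega)
    x = phi0
    y = -j0 / (C * omega)
    B = math.hypot(x, y)        # first formula in (13.8.7)
    alpha = math.atan2(y, x)    # consistent with tan(alpha) = -j0/(C*omega*phi0)
    return B, alpha, omega


# Assumed sample values: phi0 = 2, j0 = 3, L = 0.5, C = 2 (so omega = 1).
B, alpha, omega = amplitude_and_phase(2.0, 3.0, L=0.5, C=2.0)

# Maximum current, formula (13.8.9): j_max = C * omega * B.
j_max = 2.0 * omega * B
```

For these values $B = \sqrt{4 + 2.25} = 2.5$, and `j_max` coincides with $\sqrt{j_0^2 + (C/L)\varphi_0^2}$ from (13.8.9).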

The circuit diagram for generating oscillations is shown in Figure 13.8.2. Here we have a source of voltage with emf $E_0$. If we close A and leave B open, then after a lapse of time $t \gg RC$ after closing the circuit the capacitance will be charged to potential $E_0$. Open A and at time $t = 0$ close B. Then oscillations in the LC circuit will set in with $\varphi = \varphi_0 = E_0$ and $j = 0$ at $t = 0$. Note that with these oscillations, the potential difference across the electrodes of the open switch A will vary periodically from 0 to $2E_0$.

There is another way of setting up oscillations in the circuit of Figure 13.8.2. First close both switches A and B. The current flowing in the circuit will be $j_0 = E_0/R$. At $t = 0$ open A. Then oscillations in the LC circuit will set in with $\varphi_0 = 0$ and $j_0 = E_0/R$ at $t = 0$. For these oscillations, the maximum amplitude of the potential will reach

$$\varphi_{\max} = j_0\sqrt{L/C} = \frac{E_0}{R}\sqrt{L/C}.$$

It will be recalled that in a circuit without capacitance, when we break the circuit containing inductance $L$, the potential difference that develops on the switch is the larger the greater the resistance of the air gap between the electrodes of the switch. When such a circuit (with no capacitance) is broken, there is always a discharge in the air gap between the contacts of the switch. In the case of a capacitance, the maximum potential difference between the electrodes of switch A does not exceed a definite value, $E_0 + \varphi_{\max}$. If this value is less than what is required to initiate the discharge in the air gap of the switch, there will be no discharge. We say that the capacitance $C$ extinguishes the discharge when an inductance circuit is broken. Note that the quantity $R^{-1}\sqrt{L/C}$ may be greater than unity. Then by opening switch B a quarter-period after opening A, we obtain a potential on capacitance $C$ that is higher than the potential on the voltage source, $E_0$.

Figure 13.8.3

Exercises 

13.8.1. Consider the variation of potential with time in the circuit shown in Figure 13.8.3. Determine the greatest value of $\varphi$ and the time required to attain it. Assume that switch A is closed at time $t = 0$.

13.8.2. In the preceding problem, find the energy of capacitance and the energy released by the voltage source when $\varphi$ is at a maximum.

13.9 Damped Oscillations 

Let us consider a circuit with a resistance $R$ in series with an inductance $L$ (Figure 13.9.1). We assume $R$ to be small. If $R$ is not taken into account at all, we get the circuit diagram of Figure 13.8.1. If $\varphi = \varphi_0$ and $j = 0$ at $t = 0$, then, by (13.8.7), we have

$$\varphi = \varphi_0\cos\omega t, \qquad j = j_m\sin(\omega t + \pi) = -j_m\sin\omega t, \qquad (13.9.1)$$

with

$$j_m = j_{\max} = \varphi_0\sqrt{\frac{C}{L}}. \qquad (13.9.2)$$

The total energy $P$ is then $C\varphi_0^2/2$ or, using (13.9.2),

$$P = \frac{Lj_m^2}{2}. \qquad (13.9.3)$$



Figure 13.9.1 

When a resistance is present in the circuit, electric energy is converted into thermal energy. The power output of the resistance, $h$, is given by the following formula:

$$h = Rj^2 = Rj_m^2\sin^2\omega t = \frac{Rj_m^2}{2}\,(1 - \cos 2\omega t). \qquad (13.9.4)$$

The power output in the case of electric oscillations does not remain constant: over each period (cycle), $h$ twice reaches a maximum and twice becomes zero (the sign of course does not change, that is, the power output is always positive). Let us find the average value of $h$ over one period. From (13.9.4) we find that

$$\bar h = \frac{Rj_m^2}{2}\,\overline{(1 - \cos 2\omega t)}.$$

Recalling that the average value of the cosine over one period is zero, we obtain

$$\bar h = \frac{Rj_m^2}{2}.$$

Heat release on resistance $R$ can occur only as a result of reduction in the electric energy $P$. Therefore

$$\frac{dP}{dt} = -h. \qquad (13.9.5)$$

We assumed that $R$ was small, and so $h$ is small. The oscillation energy falls off slowly and an appreciable change in energy becomes noticeable only after several cycles. Considering time intervals that are large compared with the oscillation period $T$, we replace $h$ by $\bar h$ in the right-hand side of (13.9.5):

$$\frac{dP}{dt} \approx -\bar h = -\frac{Rj_m^2}{2}. \qquad (13.9.6)$$

Since the energy $P$ varies slowly, from (13.9.3) we see that $j_m$ too is a slowly varying quantity. Expressing $j_m$ from (13.9.3), we get

$$j_m = \sqrt{2P/L}. \qquad (13.9.7)$$

Using this, from (13.9.6) we get $dP/dt = -(R/L)\,P$ (here $\approx$ was replaced with $=$). The solution to this equation is

$$P = P_0 e^{-(R/L)t},$$

where $P_0$ is the value of $P$ at $t = 0$. Therefore, according to (13.9.7), $j_m = \sqrt{2P_0/L}\;e^{-Rt/2L}$. Then

$$j = \sqrt{2P_0/L}\;e^{-Rt/2L}\sin(\omega t + \pi). \qquad (13.9.8)$$

Recalling that $\varphi = \varphi_0\cos\omega t$ and $\varphi_0 = j_m/C\omega$, we get

$$\varphi = \frac{j_m}{C\omega}\cos\omega t = \frac{1}{C\omega}\sqrt{\frac{2P_0}{L}}\;e^{-Rt/2L}\cos\omega t. \qquad (13.9.9)$$

Formulas (13.9.8) and (13.9.9) show 
that if a small resistance is present, the 
electric oscillations damp out via an 
exponential law. 

The solution given above was obtained by means of an approximate calculation. Note that in this approximate solution the relation $j = C\,(d\varphi/dt)$ is not satisfied, although it holds the more exactly the smaller $R$ is. Let us now try to solve the problem exactly. For the circuit shown in Figure 13.9.1 we have $\varphi + \varphi_R + \varphi_L = 0$, whence

$$\varphi + Rj + L\,\frac{dj}{dt} = 0, \qquad (13.9.10)$$

and $j = C\,(d\varphi/dt)$. Substituting the expression for $j$ and $dj/dt$ into (13.9.10), we find that

$$LC\,\frac{d^2\varphi}{dt^2} = -\varphi - RC\,\frac{d\varphi}{dt}. \qquad (13.9.11)$$

We will seek the solution to this equation in the form obtained in the approximate consideration, or

$$\varphi = Ae^{-\lambda t}\cos\omega t, \qquad (13.9.12)$$

where $A$, $\omega$, and $\lambda$ are constants that we have to determine. The procedure that follows agrees completely with the one encountered in the theory of oscillations (see Section 10.4, and compare the solution (13.9.12) of Eq. (13.9.11) with formula (10.4.7)), so that here we could simply refer to the results obtained in Section 10.4. However, we will not complicate matters by referring the reader to Chapter 10 to compare the different notations, and will briefly repeat the required derivation.

We put the expression (13.9.12) for $\varphi$ and the expressions for the first and second derivatives of $\varphi$ that follow from (13.9.12) into Eq. (13.9.11) and cancel the common factor $Ae^{-\lambda t}$ out of all terms to get

$$LC\lambda^2\cos\omega t + 2LC\lambda\omega\sin\omega t - LC\omega^2\cos\omega t = -\cos\omega t + RC\lambda\cos\omega t + RC\omega\sin\omega t.$$

For this equation to be valid for arbitrary $t$, it is necessary that the coefficients of $\cos\omega t$ and $\sin\omega t$ be equal separately on the right and on the left:

$$LC\lambda^2 - LC\omega^2 = RC\lambda - 1, \qquad (13.9.13)$$

$$2LC\lambda\omega = RC\omega. \qquad (13.9.14)$$

The condition (13.9.14) yields the value of $\lambda$, and by substituting this value into (13.9.13) we get $\omega$:

$$\lambda = \frac{R}{2L}, \qquad \omega = \sqrt{\frac{1}{LC} - \frac{R^2}{4L^2}}. \qquad (13.9.15)$$

The constant $A$ in (13.9.12) has yet to be determined. To do this we must specify the initial condition (say, $\varphi = \varphi_0$ at $t = 0$). Finally, knowing $\varphi(t)$, we can easily find $j = C\,(d\varphi/dt)$. We then have

$$j = -CAe^{-\lambda t}(\omega\sin\omega t + \lambda\cos\omega t). \qquad (13.9.16)$$

Comparing the exact solution with the approximate, we note the following: (1) in the approximate consideration of the problem we correctly determined the number $\lambda$, which describes the rate of decay of the oscillations (however, the approximate solution does not yield the dependence of frequency $\omega$ on resistance $R$), and (2) the formula for current is somewhat different from the one that was obtained in approximate fashion.

Figure 13.9.2

In exactly the same manner we can show that Eq. (13.9.11) has yet another solution,

$$\varphi = Be^{-\lambda t}\sin\omega t, \qquad (13.9.17)$$

where $\omega$ and $\lambda$ are the same as in (13.9.15). The corresponding current is

$$j = CBe^{-\lambda t}(\omega\cos\omega t - \lambda\sin\omega t). \qquad (13.9.18)$$

The sum $\varphi = e^{-\lambda t}(A\cos\omega t + B\sin\omega t)$ of the solutions (13.9.12) and (13.9.17) is also a solution to Eq. (13.9.11). It is only with the aid of this sum that we can solve the general problem: to find the solution to Eq. (13.9.11) with the initial conditions $\varphi = \varphi_0$ and $j = j_0$ at $t = 0$. Indeed, for the coefficients $A$ and $B$ we then have the equations $\varphi_0 = A$ and $j_0 = -CA\lambda + CB\omega$, whence $A = \varphi_0$ and $B = (C\lambda\varphi_0 + j_0)/C\omega$.
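The general damped solution can be assembled in code and checked against Eq. (13.9.10). What follows is a minimal sketch under stated assumptions: the function name `damped_solution` is hypothetical, and the sample values $L = C = 1$, $R = 0.1$, $\varphi_0 = 1$, $j_0 = 0$ are chosen only for illustration.

```python
import math


def damped_solution(phi0, j0, R, L, C):
    """Return functions phi(t), j(t) for the oscillatory case of Eq. (13.9.11)."""
    lam = R / (2.0 * L)                                   # (13.9.15)
    radicand = 1.0 / (L * C) - R ** 2 / (4.0 * L ** 2)
    if radicand <= 0.0:
        raise ValueError("R is too large for an oscillatory solution")
    omega = math.sqrt(radicand)                           # (13.9.15)
    A = phi0
    B = (C * lam * phi0 + j0) / (C * omega)

    def phi(t):
        return math.exp(-lam * t) * (A * math.cos(omega * t)
                                     + B * math.sin(omega * t))

    def j(t):
        # Sum of the currents (13.9.16) and (13.9.18), i.e. j = C * dphi/dt.
        return C * math.exp(-lam * t) * (
            (B * omega - A * lam) * math.cos(omega * t)
            - (A * omega + B * lam) * math.sin(omega * t)
        )

    return phi, j


# Assumed sample values for illustration.
phi_d, j_d = damped_solution(1.0, 0.0, R=0.1, L=1.0, C=1.0)

# Residual of (13.9.10), phi + R*j + L*dj/dt, with dj/dt by central difference.
t_check, step = 0.7, 1e-5
dj_dt = (j_d(t_check + step) - j_d(t_check - step)) / (2.0 * step)
residual = phi_d(t_check) + 0.1 * j_d(t_check) + 1.0 * dj_dt
```

The residual is zero to within the accuracy of the finite-difference derivative, and `phi_d(0)` and `j_d(0)` reproduce the prescribed initial values.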

Exercises 

13.9.1. Find $j(t)$ for the circuit of Figure 13.9.1 if $C = 1$, $L = 1$, and $R = 0.1$, 0.5, 1. Assume that $\varphi = 1$ and $j = 0$ at $t = 0$.

13.9.2. The same question if $\varphi = 0$ and $j = 0$ at $t = 0$.

13.9.3. Using the approximate method, find the rate of decay $\lambda$ of oscillations in the circuit shown in Figure 13.9.2 on the assumption that $R$ is very great.

13.10* The Case of a Large Resistance 

The case considered here of a large resistance is mainly of mathematical interest and is not connected with the sequel. It may therefore be skipped in a first reading.

The solution of Eq. (13.9.11) obtained in the preceding section is valid only for $R$ that are not too large. Indeed, from (13.9.15) it is evident that if $R$ is greater than $2\sqrt{L/C}$, the formula we have for $\omega$ is meaningless since the radicand is negative. In that case Eq. (13.9.11) has a different kind of solution. We will seek the solution in the form $\varphi = Ae^{-\beta t}$ (and accordingly, $j = -AC\beta e^{-\beta t}$). Substituting into (13.9.11) the expressions for $\varphi$ and its derivatives and canceling $Ae^{-\beta t}$ out of all terms, we get $LC\beta^2 = -1 + RC\beta$. This is a quadratic equation in $\beta$. Solving it, we find that

$$\beta = \frac{R}{2L} \pm \sqrt{\frac{R^2}{4L^2} - \frac{1}{LC}}. \qquad (13.10.1)$$

The radicand in (13.10.1) differs in sign from the radicand in (13.9.15). Hence, in those cases where it is impossible to find $\omega$ we can find $\beta$, and vice versa. Formula (13.10.1) yields two distinct values of $\beta$, and so we can set up two solutions to Eq. (13.9.11), $\varphi = Ae^{-\beta_1 t}$ and $\varphi = Be^{-\beta_2 t}$. Their sum is also a solution:

$$\varphi = Ae^{-\beta_1 t} + Be^{-\beta_2 t}. \qquad (13.10.2)$$

Accordingly

$$j = -AC\beta_1 e^{-\beta_1 t} - BC\beta_2 e^{-\beta_2 t}. \qquad (13.10.3)$$

If $\varphi = \varphi_0$ and $j = j_0$ at $t = 0$, then, assuming that $t = 0$ in (13.10.2) and (13.10.3), we get $A + B = \varphi_0$ and $-AC\beta_1 - BC\beta_2 = j_0$. We can find $A$ and $B$ from this system of equations.
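Solving the two linear equations for $A$ and $B$ is a short exercise that can be sketched in code. The function name `overdamped_solution` is hypothetical and the sample values ($L = C = 1$, $R = 6$, $\varphi_0 = 1$, $j_0 = 0$) are assumptions chosen for illustration.

```python
import math


def overdamped_solution(phi0, j0, R, L, C):
    """Return (beta1, beta2, phi, j) for R > 2*sqrt(L/C), formulas (13.10.1)-(13.10.3)."""
    radicand = R ** 2 / (4.0 * L ** 2) - 1.0 / (L * C)
    if radicand <= 0.0:
        raise ValueError("R must exceed 2*sqrt(L/C) for two distinct real roots")
    root = math.sqrt(radicand)
    beta1 = R / (2.0 * L) + root
    beta2 = R / (2.0 * L) - root
    # Solve A + B = phi0 and -C*(A*beta1 + B*beta2) = j0.
    A = (-j0 / C - beta2 * phi0) / (beta1 - beta2)
    B = phi0 - A

    def phi(t):
        return A * math.exp(-beta1 * t) + B * math.exp(-beta2 * t)

    def j(t):
        return (-A * C * beta1 * math.exp(-beta1 * t)
                - B * C * beta2 * math.exp(-beta2 * t))

    return beta1, beta2, phi, j


# Assumed sample values for illustration.
beta1, beta2, phi_o, j_o = overdamped_solution(1.0, 0.0, R=6.0, L=1.0, C=1.0)
```

As a check on the roots, the quadratic $LC\beta^2 - RC\beta + 1 = 0$ requires $\beta_1 + \beta_2 = R/L$ and $\beta_1\beta_2 = 1/LC$.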

Let us consider in more detail the expression for $\beta$. Let $R \gg 2\sqrt{L/C}$. Then

$$\sqrt{\frac{R^2}{4L^2} - \frac{1}{LC}} = \frac{R}{2L}\sqrt{1 - \frac{4L}{R^2C}}$$

can be expanded by the binomial theorem. We confine ourselves to two terms:

$$\frac{R}{2L}\sqrt{1 - \frac{4L}{R^2C}} \approx \frac{R}{2L}\left(1 - \frac{2L}{R^2C}\right) = \frac{R}{2L} - \frac{1}{RC}.$$

Therefore

$$\beta_1 \approx \frac{R}{2L} + \frac{R}{2L} - \frac{1}{RC} = \frac{R}{L} - \frac{1}{RC} \approx \frac{R}{L},$$

since $R$ is great; similarly,

$$\beta_2 \approx \frac{R}{2L} - \frac{R}{2L} + \frac{1}{RC} = \frac{1}{RC}.$$

These values of $\beta_1$ and $\beta_2$ are familiar from Sections 13.1 to 13.5. Indeed, $\beta_1$ corresponds to current decay by the law $e^{-(R/L)t}$, which means that this is an RL circuit (see Section 13.5). The second root $\beta_2$ corresponds to current decay by the law $e^{-t/RC}$, which means that this is an RC circuit (see Section 13.2).
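How good the two-term approximation is can be seen by comparing it with the exact roots (13.10.1). This is a sketch only; the function names and the sample values $R = 100$, $L = C = 1$ are assumptions for illustration.

```python
import math


def exact_roots(R, L, C):
    """Roots beta1, beta2 of (13.10.1); requires R > 2*sqrt(L/C)."""
    root = math.sqrt(R ** 2 / (4.0 * L ** 2) - 1.0 / (L * C))
    return R / (2.0 * L) + root, R / (2.0 * L) - root


def approximate_roots(R, L, C):
    """Large-R approximations: beta1 ~ R/L (RL circuit), beta2 ~ 1/(R*C) (RC circuit)."""
    return R / L, 1.0 / (R * C)


# Assumed sample values with R much greater than 2*sqrt(L/C) = 2.
b1, b2 = exact_roots(100.0, 1.0, 1.0)
a1, a2 = approximate_roots(100.0, 1.0, 1.0)

rel_err_1 = abs(a1 - b1) / b1
rel_err_2 = abs(a2 - b2) / b2
```

For these values both relative errors are of the order of $L/(R^2C) = 10^{-4}$, and they shrink further as $R$ grows.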

Of mathematical interest is the particular case where the radicand in (13.10.1) is exactly zero:

$$\frac{R^2}{4L^2} = \frac{1}{LC},$$

so that both roots, $\beta_1$ and $\beta_2$, coincide. We obtain only one solution to Eq. (13.9.11). But in order to solve the problem with initial condition $\varphi = \varphi_0$ and $j = j_0$ at $t = 0$ we need two solutions.

How can we find the second solution? Suppose that $\beta_1 \ne \beta_2$ but the difference $\beta_1 - \beta_2$ is small. Then we have two solutions: $e^{-\beta_1 t}$ and $e^{-\beta_2 t}$. Their difference is also a solution. We write this solution as

$$e^{-\beta_1 t} - e^{-\beta_2 t} = e^{-\beta_1 t}\left(1 - e^{-(\beta_2 - \beta_1)t}\right).$$

Since $\beta_2 - \beta_1$ is small, it follows that $e^{-(\beta_2 - \beta_1)t} \approx 1 - (\beta_2 - \beta_1)\,t$ (in the Taylor series only two terms are retained), whence

$$e^{-\beta_1 t} - e^{-\beta_2 t} \approx t\,e^{-\beta_1 t}(\beta_2 - \beta_1).$$

The last expression suggests that if $\beta_2 = \beta_1 = \beta$, the second solution must be taken in the form $\varphi = Bte^{-\beta t}$. Substituting this $\varphi$ into Eq. (13.9.11) and noting that $\beta = R/2L$, we see that the equation is indeed satisfied. Thus, when $\beta_1 = \beta_2 = \beta$, we must take $\varphi$ in the form

$$\varphi = Ae^{-\beta t} + Bte^{-\beta t}.$$

This $\varphi$ (and the corresponding $j$) permits solving the problem with arbitrary initial $\varphi_0$ and $j_0$.
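The solution for coinciding roots can be verified numerically in the same way as the others. The sketch below is written under assumptions: the function name `critical_solution` is hypothetical, and the expressions $A = \varphi_0$, $B = j_0/C + \beta\varphi_0$ are not given in the text; they follow from setting $t = 0$ in $\varphi = (A + Bt)e^{-\beta t}$ and in $j = C\,(d\varphi/dt)$.

```python
import math


def critical_solution(phi0, j0, L, C):
    """Return (R, phi, j) for the case R**2/(4*L**2) = 1/(L*C)."""
    R = 2.0 * math.sqrt(L / C)    # resistance at which the radicand vanishes
    beta = R / (2.0 * L)
    A = phi0
    B = j0 / C + beta * phi0      # from j(0) = C*(B - beta*A) = j0

    def phi(t):
        return (A + B * t) * math.exp(-beta * t)

    def j(t):
        # j = C * dphi/dt
        return C * (B - beta * (A + B * t)) * math.exp(-beta * t)

    return R, phi, j


# Assumed sample values: L = 1, C = 1 (so R = 2), phi0 = 1, j0 = 0.
R_c, phi_c, j_c = critical_solution(1.0, 0.0, L=1.0, C=1.0)

# Residual of phi + R*j + L*dj/dt = 0, with dj/dt by central difference.
t_c, step_c = 0.9, 1e-5
dj_c = (j_c(t_c + step_c) - j_c(t_c - step_c)) / (2.0 * step_c)
residual_c = phi_c(t_c) + R_c * j_c(t_c) + 1.0 * dj_c
```

For these sample values the solution reduces to $\varphi = (1 + t)e^{-t}$, and the residual of Eq. (13.9.10) vanishes to within the finite-difference accuracy.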

Exercises 

13.10.1. Find $\varphi(t)$ for $L = 1$, $C = 1$, and $R = 2$, 6, 10. Assume that $\varphi_0 = 1$ and $j_0 = 0$ at $t = 0$.

13.10.2. Find $\varphi(t)$ for $L = 1$, $C = 1$, and $R = 2$, 4, provided that $\varphi_0 = 1$ and $j_0 = 1$ at $t = 0$.

13.11 Alternating Current 

We will now examine circuits in which 
the voltage source has an emf that 
varies periodically with time with a definite given frequency $\omega$. These problems are very important in radio-circuit work. The frequency of alternating
current exerts quite a different effect 
on the passage of current through an in- 
ductance and a capacitance. The higher 
the frequency, the faster the current 
varies and the “harder” it is for the 
current to pass through an inductance 
and the greater the potential difference 
that a current of a given intensity can 
set up. Contrariwise, the potential differ- 
ence of the plates of a capacitor is the 
smaller the greater the frequency. When 
the frequency is increased, the period 
diminishes and, consequently, the time 
interval decreases during which the 
current flowing in one direction can 
charge up the capacitor. Therefore, as 
the frequency increases, the charge on 
the capacitor decreases and so does the 
potential difference on the plates of 
the capacitor. 

We have already pointed out (see Sec- 
tion 13.8) that the movement of charges 
in an LC circuit may be likened to 
the oscillations of a body suspended 
from a spring: whereas in the case of a
vibrating body the distance of the body
from the origin and also its velocity 
vary periodically with time, in a cir- 
cuit the potential and current vary pe- 
riodically. 

The frequency with which a body vi- 
brates under the action of the elastic 
force of the spring (in the absence of 
any other forces) is called the natural 
frequency. Similarly, the frequency of 
oscillation of potential in an LC circuit 
is termed the natural frequency of the 
circuit. 

Developing this analogy further, we can assume that if the circuit is connected to an alternating current network (which means the potential impressed on the circuit will vary periodically), we will have what is known as resonance. What resonance means is that the amplitude of oscillation is at a maximum when the frequency $\omega$ of the current is equal to the natural frequency $\omega_0$ of the circuit. The amplitude increases sharply when the difference $\omega - \omega_0$ approaches zero. Resonance actually does take place and we will consider it in Section 13.13.

For every two-terminal network (see Figure 13.1.7) connected to an alternating-current circuit, there is a definite relationship between the potential difference and the current. Let us find this relationship first for the simplest case of separated elements $R$, $L$ and $C$ and then, in Sections 13.13 and 13.14, for more complicated circuits.

We will consider alternating current of a definite frequency $\omega$; as before, $\omega$ is connected with the period through the relation $\omega = 2\pi/T$. For example, in the USSR the standard current is 50 cycles per second (or 50 Hz), that is, $T = 1/50$ s and $\omega = 2\pi \times 50 \approx 314$ s$^{-1}$.

We refer to the circuit diagram in Figure 13.11.1, which contains an ammeter A that indicates current $j$ and a voltmeter V measuring the voltage (difference of potentials). Suppose that the ammeter and voltmeter are so inertialess (high-speed) that they permit measuring the instantaneous value of current at each instant and, hence, their readings vary with a period equal to that of the current. This experiment is usually accomplished with the aid of an oscillograph (a so-called loop oscillograph with two loops or a cathode-ray oscillograph with two beams). The positive direction of current is shown by an arrow. The voltmeter V measures $\varphi = \varphi_A - \varphi_B$. By closing one or another of the switches we can investigate the current flowing through the resistance, inductance, or capacitance.

Suppose the current varies with time in accordance with the law

$$j = j_0\cos(\omega t + \alpha). \qquad (13.11.1)$$

If this current flows through resistance $R$, by Ohm's law we have

$$\varphi_R = Rj = Rj_0\cos(\omega t + \alpha). \qquad (13.11.2)$$

For the sake of generality, let us write this as

$$\varphi_R(t) = \varphi_1\cos(\omega t + \alpha_1),$$

where $\varphi_1 = Rj_0$ and $\alpha_1 = \alpha$.

Let the current (13.11.1) flow through inductance $L$. Then

$$\varphi_L = L\,\frac{dj}{dt} = -L\omega j_0\sin(\omega t + \alpha).$$

Set

$$\varphi_L = \varphi_2\cos(\omega t + \alpha_2). \qquad (13.11.3)$$

Then $\varphi_2 = L\omega j_0$ and $\alpha_2 = \alpha + \pi/2$. Indeed, we know that $\cos(\beta + \pi/2) = -\sin\beta$ for arbitrary $\beta$ and so

$$\cos(\omega t + \alpha + \pi/2) = -\sin(\omega t + \alpha).$$

Thus, in the case of alternating current the relationship between the amplitude of current, $j_0$, and the amplitude of voltage, $\varphi_2$, in the inductance is the same as in a resistance equal to $R_2 = L\omega$. If $L$ is expressed in henrys (H) and $\omega$ in reciprocal seconds (s$^{-1}$), then $R_2$ will be expressed in ohms ($\Omega$).

Inductance differs from resistance in that the curve of the voltage is displaced a quarter-period from the current curve (Figure 13.11.2). This is quite evident from the formula

$$-\sin(\omega t + \alpha) = \cos(\omega t + \alpha + \pi/2) = \cos[\omega(t + \pi/2\omega) + \alpha].$$

Figure 13.11.2

Let the function $\cos(\omega t + \alpha)$, to which the current is proportional, reach a definite value $a$ at time $t_1$:

$$\cos(\omega t_1 + \alpha) = a.$$

The function $\cos(\omega t + \alpha + \pi/2)$, to which the voltage on the inductance is proportional, reaches the same value at a different time $t_2$, so that

$$\cos(\omega t_2 + \alpha + \pi/2) = a = \cos(\omega t_1 + \alpha).$$

Therefore, $\omega t_2 + \pi/2 = \omega t_1$, whence $t_2 = t_1 - \pi/2\omega = t_1 - T/4$, which means the voltage leads the current by one quarter of a period.

Quite naturally, we can add any integral number of periods to $t_1$ and write $t_2 = t_1 - T/4 + T = t_1 + (3/4)\,T$ or $t_2 = t_1 + (7/4)\,T$. The formula indicates the smallest (in absolute value) time shift that carries the current curve into the voltage curve.

Let us consider the case of capacitance. Here, $j = C\,(d\varphi_C/dt)$ and so

$$\varphi_C = \frac{1}{C}\int j\,dt = \frac{1}{C}\int j_0\cos(\omega t + \alpha)\,dt = \frac{j_0}{C\omega}\sin(\omega t + \alpha) \qquad (13.11.4)$$

(the constant of integration is equal to $\bar\varphi_C$, but for alternating current it is always true that $\bar\varphi_C = 0$). Writing $\varphi_C$ as $\varphi_C = \varphi_3\cos(\omega t + \alpha_3)$, we get $\varphi_3 = (1/C\omega)\,j_0$ and $\alpha_3 = \alpha - \pi/2$.

Thus, in an alternating-current circuit the relationship between the amplitude of current and the amplitude of voltage is the same on a capacitance as on a resistance equal to $R_3 = 1/C\omega$. Expressing capacitance in farads (F) and frequency in reciprocal seconds, we obtain $R_3$ in ohms.
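The equivalent resistances $R_2 = L\omega$ and $R_3 = 1/C\omega$ are easy to evaluate for the 50 Hz network mentioned above. The following is a sketch only; the function names and the component values (0.1 H and 10 µF) are assumptions chosen for illustration.

```python
import math


def inductive_reactance(L, omega):
    """Equivalent resistance of an inductance, R_2 = L * omega (ohms)."""
    return L * omega


def capacitive_reactance(C, omega):
    """Equivalent resistance of a capacitance, R_3 = 1 / (C * omega) (ohms)."""
    return 1.0 / (C * omega)


# Circular frequency of a 50 Hz network, omega = 2*pi*50, about 314 1/s.
omega_mains = 2.0 * math.pi * 50.0

# Assumed illustrative components: 0.1 H and 10 microfarads.
R2_example = inductive_reactance(0.1, omega_mains)
R3_example = capacitive_reactance(10e-6, omega_mains)
```

The two quantities depend on frequency in opposite ways: doubling $\omega$ doubles $R_2$ and halves $R_3$, and they become equal when $\omega = 1/\sqrt{LC}$.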

In a capacitance, the voltage curve is shifted forward along the $t$ axis with respect to the current curve by a quarter-period, that is, the voltage lags the current (Figure 13.11.3). Thus, the curve of voltage $\varphi_C$ in a capacitance is shifted in the opposite direction to the curve of voltage $\varphi_L$ in an inductance.

For a given current, $\varphi_L$ and $\varphi_C$ are of opposite sign. If the curves of $\varphi_L$ and $\varphi_C$ are brought to coincidence, we will see that the current flowing through the capacitance and the current flowing through the inductance have opposite signs. Indeed, all formulas expressing $\varphi$ as a function of $t$ can readily be transformed into formulas expressing $j$ as a function of $t$:

$$j = j_0\cos(\omega t + \alpha),$$

$$\varphi_R = Rj_0\cos(\omega t + \alpha),$$

$$\varphi_L = L\omega j_0\cos(\omega t + \alpha + \pi/2) = -L\omega j_0\sin(\omega t + \alpha),$$

$$\varphi_C = \frac{1}{C\omega}\,j_0\cos(\omega t + \alpha - \pi/2) = \frac{1}{C\omega}\,j_0\sin(\omega t + \alpha)$$

and

$$\varphi = \varphi_0\cos(\omega t + \alpha), \qquad j_R = \frac{\varphi_0}{R}\cos(\omega t + \alpha),$$

$$j_L = \frac{\varphi_0}{L\omega}\cos(\omega t + \alpha - \pi/2) = \frac{\varphi_0}{L\omega}\sin(\omega t + \alpha),$$

$$j_C = \varphi_0 C\omega\cos(\omega t + \alpha + \pi/2) = -\varphi_0 C\omega\sin(\omega t + \alpha).$$

The opposite phase shift and the opposite signs in the formulas referring to inductance and capacitance are of crucial importance when considering $L$ and $C$ connected in one circuit.

In alternating-current experiments, one frequently makes use of a single-beam cathode-ray oscillograph. A voltage proportional to the current is impressed on one pair of deflection plates (deflection along the $x$ axis) and a voltage proportional to $\varphi$ is applied to the other pair of deflection plates (deflection along the $y$ axis). The beam moves along a line whose equation has the form $x = aj$ and $y = b\varphi$, where the coefficients $a$ and $b$ depend on the sensitivity of the oscillograph. Since $j$ and $\varphi$ are periodic functions of time, the beam sweeps out the same curve on the screen all the time. At 50 cycles per second, the human eye cannot detect any motion of the ray and sees a solid luminous curve.

If the potential difference from the resistance, $\varphi_R$, is impressed on the vertical deflection plates of the oscillograph, the ray describes a straight line. True enough, for

$$x = aj = aj_0\cos(\omega t + \alpha), \qquad y = b\varphi_R = bRj_0\cos(\omega t + \alpha).$$

Eliminating $t$, we find that $y = (bR/a)\,x$. If to these plates we apply the potential difference from the capacitance, $\varphi_C$, the result is an ellipse:

$$x = aj_0\cos(\omega t + \alpha), \qquad y = b\,\frac{1}{C\omega}\,j_0\sin(\omega t + \alpha),$$

whence

$$\left(\frac{x}{aj_0}\right)^2 + \left(\frac{C\omega y}{bj_0}\right)^2 = \cos^2(\omega t + \alpha) + \sin^2(\omega t + \alpha) = 1.$$

Also, an ellipse results from $\varphi_L$ (inductance). If we connect $\varphi_R + \varphi_L$ or $\varphi_R + \varphi_C$, the axes of symmetry of the ellipse no longer coincide with the $x$ and $y$ axes. Thus, the shape of an oscillogram tells us what the circuit is made up of ($C$, $L$, or $R$), what the "innards of the box" consist of (see Figure 13.1.7).

13.12 Average Quantities, Power and Phase Shift

In the preceding section the current and voltage were regarded as functions of time. However, in many cases it suffices to know the average (constant) values of these quantities (cf. Section 7.8).

As an elementary case, let us consider a heating device with resistance $R$. We know that in a direct-current (dc) circuit the power output (that is, the quantity of energy released per unit of time) is $h = \varphi j = Rj^2 = \varphi^2/R$. In an alternating-current (ac) circuit the power output varies, that is, the (instantaneous) power output at time $t$ is

$$h(t) = \varphi(t)\,j(t) = Rj^2(t) = \varphi^2(t)/R.$$

Therefore (see (13.11.1)),

$$h(t) = Rj_0^2\cos^2(\omega t + \alpha). \qquad (13.12.1)$$

Over one period, $h(t)$ becomes zero twice and reaches a maximum (equal to $Rj_0^2$) twice. When considering electric heaters, however, we are usually interested in the amount of heat generated during a time interval $t$ that is many times greater than the period $T$ of alternating current (in the USSR this period is usually 0.02 s). It is therefore sufficient to know the average value of the power output over a large time interval $t$. In view of (13.12.1),

$$\bar h = \overline{Rj_0^2\cos^2(\omega t + \alpha)} = Rj_0^2\,\overline{\cos^2(\omega t + \alpha)} = \frac{Rj_0^2}{2}, \qquad (13.12.2)$$

since, as we repeatedly noted in this book, $\overline{\cos^2(\omega t + \alpha)} = 1/2$ (cf. Section 7.8 or 10.4). Equation (13.12.2) is approximate, being the more exact the greater $t$ is.

The average value of an alternating current, $\bar j$, is ordinarily defined as the intensity of a direct current that generates an equivalent power output on a resistance $R$ (see footnote 13.12):

$$R\bar j^{\,2} = Rj_0^2/2, \qquad (13.12.3)$$

whence

$$\bar j = \frac{j_0}{\sqrt{2}} \approx 0.71\,j_0. \qquad (13.12.4)$$

In the same way, the average value of the voltage, $\bar\varphi$, is determined from the condition

$$\frac{\bar\varphi^{\,2}}{R} = \frac{\varphi_0^2}{2R},$$

whence

$$\bar\varphi = \sqrt{\overline{\varphi^2}} = \frac{\varphi_0}{\sqrt{2}} \approx 0.71\,\varphi_0. \qquad (13.12.5)$$

Instruments that measure alternating current (ammeters and voltmeters) are calibrated so that they give the average value of current or voltage, $\bar j$ or $\bar\varphi$.

From formulas (13.12.4) and (13.12.5) it follows that the maximum values of current and voltage attained in an ac circuit exceed the average values by a factor of $\sqrt{2} \approx 1.41$. For example, in a circuit with an average voltage of 220 volts the maximum instantaneous voltage (peak voltage) reaches $\pm 310$ volts.

From the relations (13.12.2), (13.12.4), (13.12.5) and the formula $\varphi_0 = Rj_0$ it follows that $\bar\varphi = R\bar j$ and $\bar h = \bar\varphi\,\bar j$, so that Ohm's law and the relationship between power, current, and voltage on a resistance hold true for mean values.
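The factor $1/\sqrt{2} \approx 0.71$ can be reproduced by averaging the square of a cosine over one period numerically. This is a minimal sketch; the function name `rms_of_cosine` and the number of sample points are assumptions made for the example.

```python
import math


def rms_of_cosine(amplitude, samples=1000):
    """Root-mean-square of amplitude*cos(theta) over one full period.

    The period is sampled at equally spaced points; the mean of the
    squares is taken and then the square root.
    """
    total = 0.0
    for k in range(samples):
        value = amplitude * math.cos(2.0 * math.pi * k / samples)
        total += value * value
    return math.sqrt(total / samples)


# Average (rms) value for unit amplitude, to compare with 1/sqrt(2).
rms_unit = rms_of_cosine(1.0)

# Peak voltage of a network whose average (rms) voltage is 220 V.
peak_220 = 220.0 * math.sqrt(2.0)
```

The value `rms_unit` agrees with $1/\sqrt{2}$, and `peak_220` is about 311 V, which the text rounds to 310 V.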

13.12 Clearly, if $j$ is given by (13.11.1), the average value $\bar j$ defined by the formulas of Section 7.8 is zero. For this reason it is advisable to define the average value (also called the mean value) as we do here: $\bar j \equiv (\overline{j^2})^{1/2}$ (in mathematics this quantity is known as the root-mean-square of $j$). The square root is required so that $\bar j$ should have the dimensions of current, while the operation of squaring makes it possible to avoid the mutual compensation of positive and negative current values. Finally, the sign $\equiv$ here means "equal by definition," that is, we introduce a new definition of $\bar j$, while $\overline{j^2}$ is defined in the usual manner.

When we considered alternating current flowing through a capacitance and an inductance, we saw that the current and voltage vary along curves that are shifted with respect to one another, although the frequency is the same. Let us consider the power output in the general case of an arbitrary phase shift of $j$ with respect to $\varphi$ (that is, an arbitrary shift of one curve with respect to another). Let

$$j = j_0\cos(\omega t + \beta), \qquad \varphi = \varphi_0\cos(\omega t + \beta + \alpha).$$

Then

$$h(t) = j_0\varphi_0\cos(\omega t + \beta)\cos(\omega t + \beta + \alpha).$$

But

$$\cos(\omega t + \beta + \alpha) = \cos(\omega t + \beta)\cos\alpha - \sin(\omega t + \beta)\sin\alpha,$$

so that

$$\cos(\omega t + \beta)\cos(\omega t + \beta + \alpha) = \cos\alpha\cos^2(\omega t + \beta) - \sin\alpha\cos(\omega t + \beta)\sin(\omega t + \beta)$$

$$= \cos\alpha\cos^2(\omega t + \beta) - \tfrac{1}{2}\sin\alpha\sin(2\omega t + 2\beta).$$

And since

$$\overline{\cos^2(\omega t + \beta)} = \tfrac{1}{2}, \qquad \overline{\sin(2\omega t + 2\beta)} = 0,$$

we have

$$\bar h = j_0\varphi_0\cos\alpha \times \tfrac{1}{2} = \bar j\,\bar\varphi\cos\alpha.$$

Thus, the average value of the power output in the general case where there is a phase shift $\alpha$ is proportional to $\cos\alpha$. In the particular case of a resistance, $\alpha = 0$, $\cos\alpha = 1$, and we return to formula (13.12.2).

In the case of a capacitance, $\alpha = -\pi/2$ and $\cos\alpha = 0$, while in the case of an inductance, $\alpha = +\pi/2$ and $\cos\alpha = 0$. Hence, in both cases the average power output is equal to zero. This result is quite understandable from the standpoint of physics. In a capacitance and an inductance, electric energy is not transformed into heat, it is merely stored up. In an ac circuit, a capacitance during one half of a period takes electric energy from the circuit and stores it, only to release it back into the circuit during the other half. The same goes for inductance in an ac circuit.

An ordinary transformer without any 
load is actually a pure inductance (if we 
ignore the slight losses in the wires). A
current flows through the transformer
with an amplitude $j = \varphi/L\omega$. However,
as already stated, no power is
taken from the circuit because $\alpha = \pi/2$
and $\cos\alpha = 0$. It is an interesting fact
that electric meters are designed to measure
just the quantity $\bar{j}\,\bar{\varphi} \cos\alpha$. For
this reason, a nonloaded transformer
will hardly add anything to your electric
bill, it will only increase the total
current flowing in the wires.

If a large number of inductances 
(transformers, unloaded electric motors, 
and the like) are connected in parallel, 
the total current can become large, 
and then the losses in the electric wir- 
ing will be rather noticeable. This 
effect is an important factor relative to 
the electric networks of a whole city. 
The laws that govern the flow of cur- 
rent through circuits with inductances 
and capacitances show how such losses 
can be minimized. 

13.13. An Alternating-Current 
Oscillatory Circuit. Series Resonance 

Let us now consider an ac circuit com- 
prising resistance, inductance, and ca- 
pacitance in series (Figure 13.13.1). It 
is obvious that in this system the cur- 
rent flowing through R, L, and C is 
the same. We write the current in the form

$$j = j_0 \cos(\omega t + \alpha).$$ (13.13.1)

Figure 13.13.1

The potential difference in the series is
$\varphi = \varphi_R + \varphi_L + \varphi_C$. Recalling the formulas (13.11.2) to
(13.11.4), we get

$$\varphi = R j_0 \cos(\omega t + \alpha) - L\omega j_0 \sin(\omega t + \alpha) + \frac{j_0}{C\omega} \sin(\omega t + \alpha)$$
$$= R j_0 \cos(\omega t + \alpha) + j_0 \left(\frac{1}{C\omega} - L\omega\right) \sin(\omega t + \alpha).$$ (13.13.2)

We see from this formula that the
potential difference on the inductance
and that on the capacitance have different
signs, and so the coefficient of
$\sin(\omega t + \alpha)$ is the difference of two
terms. We write $\varphi$ as

$$\varphi = b \cos(\omega t + \alpha + \beta),$$ (13.13.3)

where $b$ is the amplitude of the potential
difference, which is to say, the
maximum value of potential difference
(the peak voltage). To find $b$, we rewrite
(13.13.3) as follows:

$$\varphi = b \cos\beta \cos(\omega t + \alpha) - b \sin\beta \sin(\omega t + \alpha).$$ (13.13.3a)

Comparing this expression with
(13.13.2), we find that

$$b \cos\beta = R j_0, \qquad b \sin\beta = j_0 \left(L\omega - \frac{1}{C\omega}\right).$$ (13.13.4)

Squaring both equations in (13.13.4),
adding, and taking the square root of
the sum yields

$$b = j_0 \sqrt{R^2 + \left(L\omega - \frac{1}{C\omega}\right)^2}.$$ (13.13.5)

This formula shows that for a given
amplitude of the current, $j_0$, the peak
voltage $b$ is at a minimum when

$$L\omega = \frac{1}{C\omega}.$$ (13.13.6)

Writing (13.13.5) as

$$j_0 = \frac{b}{\sqrt{R^2 + (L\omega - 1/C\omega)^2}},$$

we see that for a given peak voltage, $b$,
the peak current, $j_0$, is at a maximum if
condition (13.13.6) is met. This condition
may be written thus: $\omega = 1/\sqrt{LC}$,
which is nothing other than
the natural frequency of an LC circuit.
Therefore, (13.13.6) is the condition of
resonance, the condition of coincidence
of the natural frequency of the circuit
and the driving frequency of the alternating
current. Observe that at resonance
the circuit voltage is

$$\varphi = R j_0 \cos(\omega t + \alpha).$$ (13.13.7)

Using (13.13.1), we find that at resonance

$$\varphi = R j.$$ (13.13.8)

Let us now pass to average values.
We determine the average values of
current and potential difference in accord
with formulas (13.12.4) and
(13.12.5). From $\varphi_L = L\omega j_0 \sin(\omega t + \alpha)$
we obtain (see footnote 13.12)

$$\bar{\varphi}_L = L\omega \bar{j} = \frac{L\omega}{R} \bar{\varphi}.$$ (13.13.9)

Similarly, from $\varphi_C = (1/C\omega) j_0 \sin(\omega t + \alpha)$ we get

$$\bar{\varphi}_C = \frac{1}{C\omega} \bar{j} = \frac{1}{RC\omega} \bar{\varphi}.$$ (13.13.10)

Formulas (13.13.7) to (13.13.10) are
valid only in the case of resonance, that
is, when $\omega = \omega_0 = 1/\sqrt{LC}$. Putting this value
of $\omega$ into (13.13.9) and (13.13.10),
we get

$$\bar{\varphi}_L = \bar{\varphi}_C = \frac{1}{R} \sqrt{\frac{L}{C}}\, \bar{\varphi}.$$

For this reason, when we have resonance,
the voltage on the inductance
and the capacitance is the greater the
smaller the resistance $R$, and $\bar{\varphi}_L$ and
$\bar{\varphi}_C$ may exceed the ac source voltage $\bar{\varphi}$
many times over.

In a series circuit the resistances are
additive. But the “resistances” of the
capacitance and the inductance are of
opposite sign and are different functions
of the frequency. At resonance frequency
they are equal in absolute value
and hence cancel each other.

Thus, at resonance ($\omega = \omega_0$), an
RLC series circuit has a minimum “resistance”
and carries a maximum current
for a given peak voltage as compared
with that at any nonresonance
frequency $\omega \neq \omega_0$.

It is of interest to investigate in detail
how the peak voltage and the peak
current vary in the case of departure
from exact resonance, that is, when
$\omega \neq 1/\sqrt{LC}$. To do this, let us take
advantage of formula (13.13.3). We
find that $\bar{\varphi} = b/\sqrt{2}$, whence $b = \sqrt{2}\,\bar{\varphi}$.
Substituting this into (13.13.5), we get

$$\bar{\varphi} = \frac{j_0}{\sqrt{2}} \sqrt{R^2 + (L\omega - 1/C\omega)^2}.$$

But $j_0/\sqrt{2} = \bar{j}$, and so

$$\bar{\varphi} = \bar{j} \sqrt{R^2 + (L\omega - 1/C\omega)^2}.$$

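Finding $\bar{j}$ from this formula and putting
it into (13.13.9) and (13.13.10),
we obtain

$$\bar{\varphi}_L = \frac{L\omega}{\sqrt{R^2 + (L\omega - 1/C\omega)^2}}\, \bar{\varphi},$$

$$\bar{\varphi}_C = \frac{1}{C\omega \sqrt{R^2 + (L\omega - 1/C\omega)^2}}\, \bar{\varphi}.$$

If we denote by $\omega_0$ the natural frequency
of the circuit, then $\omega_0^2 = 1/LC$.
These formulas can now be written
thus:

$$\bar{\varphi}_L = \frac{\omega^2 \bar{\varphi}}{\sqrt{R^2\omega^2/L^2 + (\omega_0^2 - \omega^2)^2}},$$

$$\bar{\varphi}_C = \frac{\omega_0^2 \bar{\varphi}}{\sqrt{R^2\omega^2/L^2 + (\omega_0^2 - \omega^2)^2}}.$$ (13.13.11)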


In this form it is quite evident how the
ratio $\bar{\varphi}_L/\bar{\varphi}$ or $\bar{\varphi}_C/\bar{\varphi}$ depends on the closeness
of the natural frequency of the circuit,
$\omega_0$, to the driving frequency $\omega$.

The ratio $\bar{\varphi}_L/\bar{\varphi}$ as a function of $\omega$ near
$\omega = \omega_0$ is shown in Figure 13.13.2. The
graph is constructed for the case of
$R/L\omega_0 = 0.05$. It is illustrative of the
typical resonance curve.
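The shape of this curve is easy to tabulate. The sketch below is only an illustration under our own notation: dividing numerator and denominator of (13.13.11) by $\omega_0^2$ gives the ratios in terms of the dimensionless quantities $x = \omega/\omega_0$ and $d = R/L\omega_0$ (both names are assumptions introduced here), with $d = 0.05$ as in the graph.

```python
import math

def ratio_L(x, d):
    """phi_L / phi from (13.13.11), with x = omega/omega0 and d = R/(L*omega0)."""
    return x**2 / math.sqrt((d * x) ** 2 + (1.0 - x**2) ** 2)

def ratio_C(x, d):
    """phi_C / phi from (13.13.11) in the same dimensionless variables."""
    return 1.0 / math.sqrt((d * x) ** 2 + (1.0 - x**2) ** 2)

d = 0.05  # value of R/(L*omega0) quoted for the graph
for x in (0.90, 0.95, 0.975, 1.00, 1.025, 1.05, 1.10):
    print(f"omega/omega0 = {x:5.3f}: phi_L/phi = {ratio_L(x, d):6.2f}, "
          f"phi_C/phi = {ratio_C(x, d):6.2f}")
```

At $x = 1$ both ratios equal $1/d = L\omega_0/R = (1/R)\sqrt{L/C}$, which is the step-up of voltage at resonance found earlier; a few percent away from resonance the ratios are several times smaller.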

If $R/L\omega \ll 1$, the dependence of
$\bar{\varphi}_L/\bar{\varphi}$ or $\bar{\varphi}_C/\bar{\varphi}$ on $\omega$ is mainly determined
by the second term of the radicand,
$(\omega_0^2 - \omega^2)^2$. When $\omega = \omega_0$, this term
vanishes, and under the assumption
that $R/L\omega \ll 1$ the denominator has a
minimum and the ratio $\bar{\varphi}_L/\bar{\varphi}$ or $\bar{\varphi}_C/\bar{\varphi}$
a maximum. The ratio constitutes 70%
of the maximum value when
$(\omega_0^2 - \omega^2)^2 = R^2\omega^2/L^2$, that is, at
$\omega_0^2 - \omega^2 = \pm R\omega/L$, whence

$$\omega_0 - \omega = \pm \frac{\omega}{\omega_0 + \omega} \frac{R}{L} \approx \pm \frac{R}{2L}.$$


The variation in frequency for which
the square of the ratio $\bar{\varphi}_L/\bar{\varphi}$ or $\bar{\varphi}_C/\bar{\varphi}$
falls to one half of the square of the
maximum value is called the half-width
of the resonance curve. If the ratio
is 70% (0.7) of the maximum, its
square comes to $0.7^2 \approx 0.5$ of the maximum
value of the square. Therefore,
the half-width of the resonance curve,
$|\omega - \omega_0|$, constitutes $R/2L$, which means
that the half-width (also called the
half-width at half-maximum) is equal
to the quantity that characterizes the
rate of decay of oscillations in that circuit
(see Section 13.9).

Consequently, the smaller the resistance
$R$, the smaller the half-width
of the resonance curve and the steeper
the curve near $\omega = \omega_0$. From formulas
(13.13.11) we see that the smaller
the resistance $R$ the greater the maximum
of the ratio $\bar{\varphi}_L/\bar{\varphi}$ or $\bar{\varphi}_C/\bar{\varphi}$. That is
why the phenomenon of resonance is
particularly evident if $R$ is small.
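The estimate of the half-width can be verified directly. This is a small sketch in the dimensionless variables $x = \omega/\omega_0$ and $d = R/L\omega_0$ (our notation, not the book's): the condition $(\omega_0^2 - \omega^2)^2 = R^2\omega^2/L^2$ becomes $1 - x^2 = \pm d\,x$, a quadratic equation whose positive roots are written out in `half_power_points`.

```python
import math

def ratio_C(x, d):
    """phi_C / phi from (13.13.11), with x = omega/omega0 and d = R/(L*omega0)."""
    return 1.0 / math.sqrt((d * x) ** 2 + (1.0 - x**2) ** 2)

def half_power_points(d):
    """Positive roots of 1 - x**2 = +d*x and 1 - x**2 = -d*x."""
    root = math.sqrt(d * d + 4.0)
    return (root - d) / 2.0, (root + d) / 2.0

d = 0.05
lo, hi = half_power_points(d)
print("lower point:", lo, " distance from resonance:", 1.0 - lo)
print("upper point:", hi, " distance from resonance:", hi - 1.0)
print("estimate R/2L in these units:", d / 2.0)
print("ratio to the value at resonance:",
      ratio_C(lo, d) / ratio_C(1.0, d), ratio_C(hi, d) / ratio_C(1.0, d))
```

In these units the estimate $R/2L$ corresponds to $d/2$; the two distances from resonance differ from it only in the fourth decimal place, and the ratio at both points is close to 0.7 of its value at resonance.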


13.14 Inductance and Capacitance
in Parallel. Parallel Resonance

Consider the circuit in Figure 13.14.1,
which differs from that in Figure 13.13.1
in that L and C are in parallel. We
take the resistance of the circuit to be
extremely small (in other words, we
neglect it). Then $\varphi_C$ and $\varphi_L$ coincide
and are equal to the voltage $\varphi$ in the
circuit (that is, of the ac source), while
the current $j$ is made up of the current
$j_C$ flowing through C and the current
$j_L$ flowing through L. Let
$\varphi_L = \varphi_C = \varphi = \varphi_0 \cos(\omega t + \alpha)$. Using the
formulas of Section 13.11, we find that

$$j_C = -C\omega \varphi_0 \sin(\omega t + \alpha),$$
$$j_L = \frac{\varphi_0}{L\omega} \sin(\omega t + \alpha).$$

Figure 13.14.1

Therefore

$$j = j_C + j_L = \varphi_0 \left(\frac{1}{L\omega} - C\omega\right) \sin(\omega t + \alpha),$$

whence, assuming that $\omega_0 = 1/\sqrt{LC}$,

$$\bar{\varphi} = \frac{\bar{j}}{1/L\omega - C\omega} = \frac{\omega \bar{j}}{C(\omega_0^2 - \omega^2)}.$$

In this case too we see a typical resonance
relationship: for a given current
$j$, the voltage $\varphi$ is the greater the
closer $\omega$ is to $\omega_0$. It is easy to see that
when $\omega$ is close to $\omega_0$, $j_L$ and $j_C$ are
much greater than the current $j$ in the
circuit, that is, an ac circuit containing
L and C experiences strong oscillations.
A small external current suffices to
sustain much stronger currents in the
circuit.

It will be recalled that in a parallel
circuit, conductances (which are the reciprocals
of resistances) are additive:

$$\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}.$$

The “conductances” (that is, the ratios
of current to potential difference)
of a capacitance and an inductance have
opposite signs and depend differently
on the frequency. At resonance
($\omega = \omega_0$) they cancel each other and the total
“conductance” is at a minimum, which
is to say, the current is the smallest
for a given potential difference and,
hence, the potential difference $\varphi_{AB}$ is
at a maximum for a given current in
the external circuit.

Figure 13.14.2

In a simplified circuit without resistance,
the amplitude of oscillations
grows without limit as $\omega$ approaches $\omega_0$.
In reality, the resistance in the circuit
makes the amplitude finite when $\omega = \omega_0$.

If R is connected in parallel with L 
and C, all calculations become very 
similar to those of the preceding section. 
But this case is rarely encountered 
in practical situations. Ordinarily, the 
inductance has a perceptible “resistan- 
ce” and therefore the typical circuit 
diagram is that shown in Figure 13.14.2. 
In this case the calculations are some- 
what longer than in the preceding sec- 
tion and we will not carry them through 
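in detail. The result for $\omega$ close to $\omega_0$
and for small ratios $R/L\omega$ is

$$\frac{\bar{j}_L}{\bar{j}} \approx \frac{\bar{j}_C}{\bar{j}} \approx \frac{\omega_0^2}{\sqrt{(R\omega/L)^2 + (\omega_0^2 - \omega^2)^2}}.$$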

It thus turns out that current amplifica- 
tion at resonance in a parallel ac circuit 
obeys the same law as voltage ampli- 
fication in a series circuit discussed in 
Section 13.13. 

13.15 General Properties of Resonance 
in a Linear System 

Note in our reasoning that when we 
added oscillations caused by a force at 
different moments of time, we assumed 
all along that the system without that 
force was linear. The behavior of the
system is described by a linear differential
equation, and the unknown quantity
appears in all terms in the first power.
Therefore, the superposition principle
is valid, that is, the sum of two
or several particular solutions of the
equation is also a solution.

In Section 13.13 we obtained a formula
for the half-width of a resonance
curve, $|\omega - \omega_0| = R/2L$, which shows
that the slower the decay of oscillations
the smaller the half-width. This
does not occur only with respect to electric
oscillations. We can consider any
system capable of oscillation. Let an
external force give rise to oscillations in
such a system and then cease to operate.
The system is now on its own. The
oscillations begin to decay. If the amplitude
of oscillations decays like $e^{-\gamma t}$,
then $\gamma$ characterizes the rate of decay
(it has the dimensions s$^{-1}$). During time
$t = 1/\gamma$ the amplitude diminishes by a
factor of $e$, or by 63%.

Let us now consider the resonant step- 
up of such a system by a periodic exter- 
nal force. The amplitude of oscillations 
at a given time is the sum of the ampli- 
tudes acquired during the time of step- 
up. In the presence of damping, an 
amplitude acquired a long time before 
will have time to decay and will not 
play any part or make any contribution 
to the amplitude of oscillation at the 
given time. 

The decay time is clearly $1/\gamma$. In this
time the amplitude of free oscillations
will have diminished $e$ times, or by
63%. Hence, even when the stepup
force is constantly operating from
$t = -\infty$, the oscillation amplitude will still
be determined solely by the time interval
from $t - 1/\gamma$ to $t$, where $t$ is the
time of observation. The action of the
force at earlier times will have already
decayed.

For the difference between two periodic
forces with somewhat distinct periods,
$F_0 \sin\omega_0 t$ and $F_0 \sin\omega t$, to manifest
itself conspicuously, a time $T$ of
observation is needed during which their
phases will have separated by approximately
$\pi$: $\omega T = \omega_0 T \pm \pi$, so
that $|\omega - \omega_0| = \pi/T$. Consequently,
if the oscillations of a system “remember”
only the action of the force during
time $T = 1/\gamma$, then in such a system a
difference $\pi/T = \pi\gamma$ in the frequencies
of the exciting force hardly affects the
amplitude. From this we see that the
half-width of the resonance curve is
proportional to the decay factor $\gamma$.

On the other hand, since the system 
“remembers” and accumulates the action 
of a force during time 1/y, the oscil- 
lation amplitude at resonance (hence 
also the height of the resonance peak) is 
inversely proportional to y. Calculations 
confirm this reasoning. 
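One way to see such a calculation is to integrate the motion step by step. The sketch below is an illustration under assumptions of our own: we take the linear system to be $\ddot{x} + 2\gamma\dot{x} + \omega_0^2 x = F_0 \sin\omega t$ (whose free oscillations decay like $e^{-\gamma t}$), start it from rest, integrate by the classical fourth-order Runge-Kutta method for a time much longer than $1/\gamma$, and record the largest deviation during the last period. The function name and all numerical values are hypothetical.

```python
import math

def steady_amplitude(gamma, omega, omega0=1.0, F0=1.0, dt=0.01, t_end=200.0):
    """Amplitude reached by x'' + 2*gamma*x' + omega0**2 * x = F0*sin(omega*t)."""
    def accel(t, x, v):
        return F0 * math.sin(omega * t) - 2.0 * gamma * v - omega0**2 * x

    x, v, t = 0.0, 0.0, 0.0
    steps = int(round(t_end / dt))
    tail = int(round(2.0 * math.pi / omega / dt))  # samples in the last period
    peak = 0.0
    for n in range(steps):
        # One classical Runge-Kutta step for the pair (x, v).
        k1x, k1v = v, accel(t, x, v)
        k2x = v + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x = v + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x = v + dt * k3v
        k4v = accel(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        if n >= steps - tail:
            peak = max(peak, abs(x))
    return peak

print("gamma = 0.05, at resonance:      ", steady_amplitude(0.05, 1.00))
print("gamma = 0.10, at resonance:      ", steady_amplitude(0.10, 1.00))
print("gamma = 0.05, detuned by gamma:  ", steady_amplitude(0.05, 1.05))
```

Doubling $\gamma$ halves the amplitude at resonance, and a detuning equal to $\gamma$ lowers the amplitude to roughly 0.7 of its resonance value, in agreement with the qualitative argument.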

We would like to remind the reader 
once more that the system we are con- 
sidering is, in the absence of an exter- 
nal force, linear. The behavior of such 
a system is described by a linear differ- 
ential equation, thus making the su- 
perposition principle formulated above 
quite applicable. For nonlinear systems, 
the superposition principle fails, and 
the theory of such systems proves much 
more complex. Also note that a linear 
system does not generate oscillations 
even if a constant emf source (an ideal 
source) is present in the circuit. 

13.16* Displacement Current and the 
Electromagnetic Theory of Light 

Up to now we have almost everywhere
considered current flow through a capacitor
without any reservations. Indeed,
if we connect a capacitor in an ac circuit
in series with an ammeter, the
ammeter will indicate a definite current
$\bar{j} = C\omega\bar{\varphi}$. On the other hand, no
current flows through a capacitor be- 
cause the plates of the capacitor are sep- 
arated by an insulator (air or a mate- 
rial or even a vacuum), and so the in- 
dividual current carriers (electrons) in 
the left-hand conductor and plate will 
never get over to the right-hand plate 
and conductor. Consequently, no charged 
particles are in motion in the space 
between the plates, that is, there is no 
electric current in the sense that we have 
spoken of current up to now. All there
is in this space is an electric field that
varies when the charge on the plates
varies, which is to say, when a current
flows in the right- and left-hand conductors.
We can now do one of two things:

(1) either beg the reader’s pardon and
explain that wherever we have spoken
of current flowing through a capacitor
(capacitance) this was not so: actually
there was no current, the only current
flow being in the conductors on the
right and left;

(2) or regard the varying electric field 
in the space between the plates on a 
par with ordinary current (the motion 
of charged particles). Maxwell, who 
suggested this view, was able to draw 
conclusions of tremendous significance. 

It had long since been known that 
electric current (the motion of charged 
particles) gives rise to a magnetic field . 
But if a varying electric field is similar 
to an electric current, an electric field 
varying in a vacuum should also set up 
a magnetic field. This hypothesis of 
Maxwell led to a remarkable symmetry 
between electric and magnetic fields. 
Faraday experimentally discovered in- 
duction, that is, the fact that any vari- 
ation of a magnetic field gives rise to 
an electric field. Maxwell, in strictly 
theoretical fashion, hypothesized the 
existence of a similar phenomenon in 
which any variation of an electric field 
gives rise to a magnetic field. Only then 
did the theory of electric and magnetic 
fields acquire its modern form. 

The mathematical theory of Maxwell
is written in the form of differential
equations that are too complicated for
this book and so we do not give them.$^{13.13}$
The solutions to these equations describe
the propagation of electric and magnetic
fields in empty space. Both fields must
be present at all times: a variation in
the electric field gives rise to a magnetic
field, and any change in the magnetic
field generates an electric field.

13.13 The characteristics of electric and
magnetic fields in space are functions of
several variables: they depend on the three
coordinates of a point in space and on time.
Accordingly, the derivatives that are present
in the Maxwell equations are the partial
derivatives of functions of several variables,
and the equations themselves belong to the
class of partial differential equations (in the
Conclusion we touch on such equations).
Moreover, these characteristics are vector
quantities. Finally, note that complex numbers
and the theory of analytic functions of
a complex variable (see Chapter 17) fit elegantly
into the theory of electromagnetism.

At the time when Maxwell worked, 
Faraday’s experiments had already been 
completed, that is, the relationship 
between a varying magnetic field and 
an emf induced by it was known. Also 
known was the magnetic field of a cur- 
rent. Finally, the relationship between 
the charge on a capacitor and the ele- 
ctric field between the plates was like- 
wise known. These findings sufficed for 
writing down the equations for fields 
in empty space. 

Maxwell found the rate of propaga- 
tion of the fields in vacuo. This velocity 
proved to be equal to that of light! From 
this it was natural to conclude that 
light is nothing other than electromag- 
netic oscillations. Furthermore, the theo- 
ry predicted the possibility of the exist- 
ence of electromagnetic oscillations 
of any wavelength including X rays 
(whose wavelength is thousands of 
times shorter than that of visible light) 
and radio waves with very large wave- 
lengths. It was thus that the investiga- 
tions of Faraday and Maxwell began 
the work that culminated in the discov- 
ery of radio waves by Hertz, the in- 
vention of radio as a means of communi- 
cation by the Russian A. S. Popov and 
the Italian Marchese G. Marconi, the 
discovery of X rays by Wilhelm K.
Röntgen, and the creation of a complete
theory of the electromagnetic field 
and the interaction of this field with 
matter. 

13.17* Nonlinear Resistance and 
the Tunnel Diode 

Let us consider a two-terminal network 
(a “box”) that is similar to a resistance 
in the sense that current flowing through 
the “box” depends solely on the instantaneous
value of the potential difference.
In this respect, the “box” is not like
an inductance, where $\varphi$ depends on
$dj/dt$, neither is it like a capacitance,
where $\varphi$ depends on $\int j\,dt$. However,
the “box” differs from an ordinary resistance
in that the function $j(\varphi)$ differs
from Ohm’s law $j = \varphi/R$. The “box”
has a more involved function $j(\varphi)$.
This function is called the characteristic
curve of the “box”.

The only general assertion that can
be made with respect to $j(\varphi)$ is that $\varphi$
and $j$ cannot have different signs if
batteries or some sources of energy are
not hidden in the box. If $\varphi$ and $j$ are
of the same sign, energy is absorbed inside
the box during current flow and
the box takes up electric energy from
the circuit to which it is connected. In
the box this electric energy is converted
into heat and is dissipated. Since
$\varphi$ is continuous and negative when $j < 0$
and positive when $j > 0$, it follows
that $\varphi = 0$ at $j = 0$. In all other respects
the functional relationship between
$j$ and $\varphi$ can be of any kind. For
example, for current rectifiers we have
boxes whose characteristic curve is
shown in Figure 13.17.1: the current
flows easily in one direction for a small
potential difference and hardly at all
in the other direction. As can be seen
from the graph, the current is small
even for a large negative potential difference.
Such are the properties possessed
by diodes made of two semiconductors.
In 1958 the Japanese devised a “box”
made up of specially chosen semiconductors,
the tunnel diode (in reality, this
“box” is in the form of a minute cylinder
just a few millimeters in diameter
and altitude), which has an unusual
curve of $j(\varphi)$ with a minimum (see
Figure 13.17.2 in which typical values
of $\varphi$ and $j$ are indicated).$^{13.14}$ This curve
does not contradict the principle expressed
above: the sign of $\varphi$ is the same as
that of $j$ everywhere, which means the
“box” only absorbs energy. We will not
go into the physical reasons for such a
strange curve, but we will examine the
consequences for a circuit involving a
tunnel diode. For the sake of brevity
we will continue to call it a box.

Figure 13.17.1

We start with the simplest type of 
circuit consisting of three parts: a bat- 
tery with emf E , a resistance R (the or- 
dinary kind that obeys Ohm’s law), 
and the box (Figure 13.17.3). We in- 
clude the internal resistance of the bat- 
tery in R. 

The equation defining the current and
distribution of potential in the circuit
is of the form $-E + Rj(\varphi) + \varphi = 0$,
where $\varphi$ is the potential difference across
the box and $j(\varphi)$ is the function defined
by the properties of the box (see Figure 13.17.2).
The current through R is
equal to the current through the box,
and so $\varphi_R = Rj = Rj(\varphi)$. This equation
is conveniently solved by graph.
We write it down as

$$\varphi = E - Rj(\varphi)$$ (13.17.1)

and we construct in the $\varphi j$-plane the
straight line $\varphi = E - Rj$. This line
may be called the load curve of the
battery-resistance system. The solution
to the problem is given by the intersection
of the straight line (13.17.1)
(that is, the straight line $j = (E - \varphi)/R$)
with the $j(\varphi)$ curve, which is the
characteristic curve of the box. In Figure 13.17.4
we have a graphic solution
of the problem involving one battery
and three different resistances: $R_1$
(small), $R_2$ (medium), and $R_3$ (large).

Figure 13.17.3

13.14 By the end of the 1970s both physicists
and designers had greatly advanced the
miniaturization of all parts of electronic
devices. The modern diode (or capacitor or
resistor) now comes in the form of a tiny spot
a small fraction of a millimeter in diameter
and consists of multiple layers with the specified
properties, hundredths or even thousandths
of a millimeter thick.

From the graph we can see that for a 
sufficiently large E we can choose an 
R such that it will not be too small or 
too large and there will be three points 
of intersection, A, B , and C, and thus 
three solutions to Eq. (13.17.1). 

For three solutions to exist, the $j(\varphi)$
curve must have a descending portion.
It is clear that a curve on which $dj/d\varphi$
is positive everywhere can intersect
the load curve only once, no matter
what $E$ and $R > 0$.
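The graphical solution can be imitated numerically. The following is a minimal sketch, assuming a made-up characteristic curve $j(\varphi) = \varphi^3 - 3\varphi^2 + 2.6\varphi$ (arbitrary units), chosen by us only because it has the same sign as $\varphi$ everywhere and possesses a descending portion; it is not the curve of a real tunnel diode. The function `intersections` scans the interval from 0 to $E$ for sign changes of $j(\varphi) - (E - \varphi)/R$ and refines each one by bisection.

```python
def j_box(phi):
    """Hypothetical N-shaped characteristic curve (illustrative only)."""
    return phi**3 - 3.0 * phi**2 + 2.6 * phi

def intersections(j, E, R, n=4001):
    """Roots of j(phi) = (E - phi)/R on the interval [0, E]."""
    def f(phi):
        return j(phi) - (E - phi) / R

    roots = []
    step = E / n
    a, fa = 0.0, f(0.0)
    for k in range(1, n + 1):
        b = k * step
        fb = f(b)
        if fa == 0.0:
            roots.append(a)          # a grid point happens to be a root
        elif fa * fb < 0.0:
            lo, hi, flo = a, b, fa   # sign change: refine by bisection
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fm = f(mid)
                if flo * fm <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fm
            roots.append(0.5 * (lo + hi))
        a, fa = b, fb
    return roots

E = 4.0
for R in (0.5, 5.0, 50.0):  # "small", "medium", "large" resistance
    print(f"R = {R:5.1f}: solutions phi =", intersections(j_box, E, R))
```

With these assumed numbers the small and the large resistance each give a single solution, while the medium resistance gives three, which play the role of the points A, B, and C.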

Now let us consider a somewhat more
complicated circuit diagram involving
capacitance in parallel with the box
(Figure 13.17.5). For this circuit we
find that the current through the box,
$j(\varphi)$, and the current flowing through
the capacitance, $C\,(d\varphi/dt)$, together equal
the current flowing through the resistance,

$$j(\varphi) + C \frac{d\varphi}{dt} = \frac{E - \varphi}{R},$$

whence

$$\frac{d\varphi}{dt} = \frac{1}{C} \left[\frac{E - \varphi}{R} - j(\varphi)\right].$$

Figure 13.17.5

The intersection points of the characteristic
curve $j(\varphi)$ of the box with
the load curve $(E - \varphi)/R$ correspond to
the solutions $\varphi = \text{constant}$, $d\varphi/dt = 0$.
Let us examine the sign of $d\varphi/dt$ near
these points. A glance at Figure 13.17.4
shows us that $d\varphi/dt$ is positive if
$\varphi < \varphi_A$ or $\varphi_B < \varphi < \varphi_C$ and negative if
$\varphi_A < \varphi < \varphi_B$ or $\varphi > \varphi_C$.

The arrows in Figure 13.17.6 indicate
the direction of variation of $\varphi$ with
time. We can see that the intermediate
solution at point B is unstable: all we
need to do is depart slightly to the left
or to the right, and $d\varphi/dt$ acquires a
sign such that the deviation of $\varphi$ from
$\varphi_B$ increases. On the other hand, the
points A and C in Figure 13.17.4 characterize
two stable solutions, which correspond
to stable states of the system.
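The instability of B and the stability of A and C can be watched in a direct calculation. This sketch integrates $C\,d\varphi/dt = (E - \varphi)/R - j(\varphi)$ by Euler's method for an invented characteristic curve $j(\varphi) = \varphi^3 - 3\varphi^2 + 2.6\varphi$ with $E = 4$, $R = 5$, $C = 1$ (arbitrary units, all assumed by us). For these values the three constant solutions are $\varphi_A = 1 - \sqrt{0.2}$, $\varphi_B = 1$, and $\varphi_C = 1 + \sqrt{0.2}$.

```python
import math

def j_box(phi):
    """Hypothetical N-shaped characteristic curve (illustrative only)."""
    return phi**3 - 3.0 * phi**2 + 2.6 * phi

def settle(phi_start, E=4.0, R=5.0, C=1.0, dt=0.01, t_end=80.0):
    """Follow phi(t) by Euler steps and return the value reached at t_end."""
    phi = phi_start
    for _ in range(int(round(t_end / dt))):
        dphi_dt = ((E - phi) / R - j_box(phi)) / C
        phi += dt * dphi_dt
    return phi

phi_A = 1.0 - math.sqrt(0.2)
phi_C = 1.0 + math.sqrt(0.2)
print("start just left of B :", settle(0.99), " (phi_A =", phi_A, ")")
print("start just right of B:", settle(1.01), " (phi_C =", phi_C, ")")
print("start far to the left:", settle(0.0))
print("start far to the right:", settle(3.0))
```

A start slightly to the left of B ends at A, a start slightly to the right ends at C, and starts outside the interval between A and C return to the nearest stable point.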

The existence of two stable states 
permits using these boxes in mathemat- 
ical machines as memory cells. By 
making a lot of such circuits and trans- 
ferring (by an external means) some in- 
to the A state and others into the C 
state, we can record (“remember”) any de- 
sired number or other information. 
Using such systems, we record informa- 
tion in coded form as AACACACCC ..., 
where each letter A or C indicates a 
state of the appropriate system (the first 
in A , the second in A , the third in C, 
the fourth in A, etc.). 


Figure 13.17.6


Figure 13.17.7


We now consider a system (Figure 13.17.7)
consisting of an inductance
and a capacitance connected in parallel
with the box. We again denote by $\varphi$ the
potential difference across the box,
$j = j(\varphi)$ the characteristic curve of the
box, and $j_1$ the current flowing through
L and C:

$$\varphi - \varphi_C = \varphi_L = L \frac{dj_1}{dt}.$$ (13.17.2)

We consider the process in the circuit
when the current and the potential in
the box are close to the intermediate
point B of intersection. Let

$$\varphi = \varphi_B + f, \qquad \varphi_C = \varphi_B + g,$$

$$j = j(\varphi_B) + f \left.\frac{dj}{d\varphi}\right|_{\varphi = \varphi_B} = j(\varphi_B) + kf,$$

where $k$ stands for the derivative $dj/d\varphi$
at $\varphi = \varphi_B$. We used the first two terms
of the Taylor series to represent current;
the intermediate potential of the
capacitance is $\varphi_B$, and $\varphi_B$ and $j(\varphi_B)$
satisfy the condition $j(\varphi_B) = (E - \varphi_B)/R$.
Substituting these expressions
into Eq. (13.17.2) and simplifying,
we get

$$kf + j_1 = -\frac{f}{R}, \qquad C \frac{dg}{dt} = j_1, \qquad f - g = L \frac{dj_1}{dt}.$$ (13.17.3)

From these equations it follows that

$$f = -\frac{1}{1/R + k}\, j_1 = -r j_1,$$ (13.17.4)

where $r^{-1} = R^{-1} + k = R^{-1} + (dj/d\varphi)_{\varphi = \varphi_B}$.
Equations (13.17.3) and (13.17.4) also yield

$$L \frac{d^2 j_1}{dt^2} = \frac{df}{dt} - \frac{dg}{dt} = -r \frac{dj_1}{dt} - \frac{1}{C} j_1,$$

and finally

$$L \frac{d^2 j_1}{dt^2} + r \frac{dj_1}{dt} + \frac{1}{C} j_1 = 0,$$ (13.17.5)

which is the ordinary equation describing
an oscillatory circuit with capacitance
C, inductance L, and resistance
$r$. The resistance $r$ reflects the fact that
the capacitance and inductance are in
parallel to two circuits, the circuit involving
the battery and resistance R
and the circuit with box and resistance
$k^{-1} = d\varphi/dj$. Since the two circuits are
connected in parallel, the conductances
(reciprocals of resistances) are additive,
whence follows the expression for $r$.

The currents and potentials in the
system break up into two terms: the
constant term $(\varphi_B, j(\varphi_B))$ and the oscillatory
term $(j_1(t), f(t), g(t))$. Here,
for the oscillatory term the role of resistance
of the box is played by the derivative
$d\varphi/dj$ taken along the characteristic
curve. If the box consisted of
an ordinary (ohmic) resistance, $\varphi = Rj$,
the derivative would be equal to the
resistance, $d\varphi/dj = R$.

What does the unusual characteristic
curve $j(\varphi)$ of the box (tunnel diode) of
Figure 13.17.4 lead to? At point B the
derivative $dj/d\varphi$ is negative, which means
that with respect to oscillations the
box has a negative resistance! What is
more, from Figure 13.17.4 it is evident
that at point B we have $|k| = |dj/d\varphi| > R^{-1}$,
since $R^{-1}$ is precisely the
slope of the load curve $j = (E - \varphi)/R$
intersecting the characteristic curve at
the point B. Consequently, the total
resistance of the oscillatory circuit,
$r = (R^{-1} + k)^{-1}$, is negative.

The equation for an oscillatory circuit
involving L, C, and $r$, with $r$ positive,
yielded damped oscillations. For
$r < 0$ this equation will yield
oscillations that grow with time.
Thus, a tunnel diode is capable of generating
oscillations in a circuit.
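The role of the sign of $r$ can be seen from the characteristic equation $Ls^2 + rs + 1/C = 0$ of (13.17.5): solutions behave like $e^{st}$, so the real part of $s$ is the rate of growth or decay. The sketch below is an illustration with values assumed by us ($R = 5$, $k = -0.4$, $L = 1$, $C = 0.01$ in arbitrary units); the function names are hypothetical.

```python
import cmath

def total_resistance(R, k):
    """r from r**(-1) = R**(-1) + k, where k = dj/dphi at the point B."""
    return 1.0 / (1.0 / R + k)

def characteristic_roots(L, C, r):
    """Roots s of L*s**2 + r*s + 1/C = 0; solutions vary like exp(s*t)."""
    disc = cmath.sqrt(r * r - 4.0 * L / C)
    return (-r + disc) / (2.0 * L), (-r - disc) / (2.0 * L)

L_ind, C_cap = 1.0, 0.01            # assumed inductance and capacitance
r_box = total_resistance(5.0, -0.4)  # negative slope steeper than 1/R
print("r with the box at point B:", r_box)
print("roots:", characteristic_roots(L_ind, C_cap, r_box))
print("roots for an ordinary r = +5:", characteristic_roots(L_ind, C_cap, 5.0))
```

With the negative $r$ both roots have a positive real part, so the oscillations grow exponentially; with a positive $r$ of the same magnitude the real part is negative and the oscillations are damped, the frequency being the same in both cases.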

The capability of generating oscillations
is a consequence of the instability
of the solution at point B. The oscillation
energy is taken from the battery.
The oscillation amplitude increases with
time by an exponential law only so long
as it may be considered small and we
can employ the Taylor series expansion
of the characteristic curve $j(\varphi)$ about
the point B. Roughly speaking, the
maximum amplitude is limited by the
points A and C in Figure 13.17.4.

Already by 1961 oscillators with
efficiencies up to 25% and power outputs
of 0.5 mW at 7500 MHz (at a 4-cm
wavelength) had been tested. For us,
tunnel-diode circuits are interesting
from the standpoint of a mathematical
consideration of a nonlinear problem,
questions of power output, and the
representation of currents in a system as a
superposition of a constant solution and
oscillations.

Chapter 14 Complex Numbers


14.1 Basic Properties of Complex
Numbers


The use of complex numbers can shed
additional light on some topics discussed
in this book. Originally, a (natural)
number characterized the number of
objects in any (finite) set, for
instance the number of children in a
family, of boats on a river, or of the
fingers of the hand. Numbers can be added:
if we unite two groups of people
consisting of $a$ and $b$ persons, we will
get $a + b$ people altogether. Numbers
can also be multiplied: $a$ bunches of $b$
flowers each will yield $ab$ flowers. But
sometimes natural numbers cannot be
subtracted or divided. Using
only natural numbers we cannot
subtract 8 from 5 or divide 3 by 2.

The quest for carrying out operations
of subtraction and division led to
the appearance of negative and fractional
(rational) numbers: it was accepted
to designate the difference $5 - 8$ as $-3$
and the quotient $3 : 2$ as $3/2$. New
numbers had quite a different meaning
since no group of people can contain
$-3$ or $3/2$ persons. (Not without reason,
Samuel Marshak, an outstanding
Soviet poet and translator, wrote about
a schoolboy who on solving the problem
obtained $2\tfrac{2}{3}$ diggers.) Operations performed
on such numbers are also defined
in a new way; indeed it is senseless,
say, to unite $-3$ boats and $3/2$ boats in
a fleet. We have to assume, for instance,
that the sum of the fractions $a/b$ and
$c/d$ is equal to $(ad + bc)/bd$, and the product
of $(-a)$ by $(-b)$ is $+ab$.$^{14.1}$

That numbers can be multiplied without
limit leads to the operation of raising
a number to a (positive integral) power:
$a^2 = a \cdot a$, $a^3 = a \cdot a \cdot a$, and generally
$a^n = a \cdot a \cdot a \cdots a$ ($n$ times). But the inverse
operation, that is, taking roots of
ordinary numbers, cannot always be
performed: there is no (positive or negative)
number $x$ whose square would be
equal to, say, $-7$, that is, there is no
number $x = \sqrt{-7}$. In what follows we
will call ordinary numbers, that is, positive,
negative, and zero, real numbers.$^{14.2}$
Attempts at taking the (square)
root of any real number lead also to the
notion of complex numbers.

14.1 These conventions only make it possible
to apply all well-known operations performed
on “old” numbers to “new” numbers
(when extending the notion of number we
have just to refresh our knowledge and not to
start from the very beginning; so, according
to our rules $a + b = a/1 + b/1 = (a \times 1 + 1 \times b)/(1 \times 1) = (a + b)/1 = a + b$). All
the general rules of operation are also valid
(for instance, such as $a + b = b + a$ or
$(a + b)c = ac + bc$). The same reasoning
also holds in considering the rules of operation
on complex numbers we are dealing with in
this chapter.

14.2 To make it possible to take the square root
of any positive number, we have to include
in the real numbers not only all (positive and
nonpositive) fractions (rational numbers),
but also irrational numbers, such as $\sqrt{2} = 1.41\ldots$
(In this connection see, for instance,
I. Niven, Numbers: Rational and Irrational,
New York, 1961, which is intended for
a wide circle of readers.)

We introduce a new “number” $\sqrt{-1}$
defined by the condition that its square
equals the number $-1$. Clearly, no
real number (either positive or negative)
possesses such a square, and therefore
we call it the imaginary unit; in
mathematics it is usually denoted $i$.
Expressions of the form $c = a + ib$ or
$z = x + iy$, where $a$ and $b$ (or $x$ and $y$)
are real, are named complex numbers.
Numbers of the form $ib$ ($= 0 + ib$) or
$iy$ are sometimes called pure imaginary
(or simply imaginary).

Four arithmetical operations on complex numbers $a = a_1 + ia_2$ and
$b = b_1 + ib_2$ directly follow from the definition of the imaginary
unit. They are introduced as the natural generalization of ordinary
operations on real numbers: if $a + b = f$, $a - b = g$, and $ab = h$,
then

$$f = f_1 + if_2 = (a_1 + ia_2) + (b_1 + ib_2) = (a_1 + b_1) + i(a_2 + b_2);$$

$$g = g_1 + ig_2 = (a_1 + ia_2) - (b_1 + ib_2) = (a_1 - b_1) + i(a_2 - b_2);$$

$$h = h_1 + ih_2 = (a_1 + ia_2)(b_1 + ib_2) = a_1b_1 + ia_1b_2 + ia_2b_1 + ia_2\,ib_2 = (a_1b_1 - a_2b_2) + i(a_1b_2 + a_2b_1)$$

(in calculating $h$ we take into consideration that $i^2 = -1$);
therefore

$$f_1 = a_1 + b_1, \quad f_2 = a_2 + b_2;$$

$$g_1 = a_1 - b_1, \quad g_2 = a_2 - b_2;$$

$$h_1 = a_1b_1 - a_2b_2, \quad h_2 = a_1b_2 + a_2b_1.$$

Similarly, if $k = a/b = k_1 + ik_2$, then $a = bk$, that is,

$$(b_1 + ib_2)(k_1 + ik_2) = (b_1k_1 - b_2k_2) + i(b_1k_2 + b_2k_1) = a_1 + ia_2.$$

For determining $k_1$ and $k_2$ we now have two linear equations,

$$b_1k_1 - b_2k_2 = a_1 \quad \text{and} \quad b_2k_1 + b_1k_2 = a_2,$$

which can be solved in all cases where $b \neq 0$ (that is, when $b_1$
and $b_2$ are not zero simultaneously):

$$k_1 = \frac{a_1b_1 + a_2b_2}{b_1^2 + b_2^2}, \qquad k_2 = \frac{a_2b_1 - a_1b_2}{b_1^2 + b_2^2}.$$


When finding the quotient $k = a/b$, we can use the fact that for any
(complex) number $b = b_1 + ib_2$ the product of $b$ by the number
$b^* = b_1 - ib_2$ (the number $b^*$ is often denoted $\bar{b}$ and
called the conjugate to $b$) is always real:

$$bb^* = (b_1 + ib_2)(b_1 - ib_2) = b_1^2 - (ib_2)^2 = b_1^2 - i^2b_2^2 = b_1^2 + b_2^2. \tag{14.1.1}$$

(If $b^* = b$, the number $b$ is real; the square root of the product,
$\sqrt{bb^*} = \sqrt{b_1^2 + b_2^2}$, is denoted $|b|$ and called the
absolute value, or modulus, of the complex number $b$.) Therefore

$$k = \frac{a}{b} = \frac{a_1 + ia_2}{b_1 + ib_2} = \frac{(a_1 + ia_2)(b_1 - ib_2)}{(b_1 + ib_2)(b_1 - ib_2)} = \frac{(a_1b_1 + a_2b_2) + i(a_2b_1 - a_1b_2)}{b_1^2 + b_2^2} = \frac{a_1b_1 + a_2b_2}{|b|^2} + i\,\frac{a_2b_1 - a_1b_2}{|b|^2}.$$
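The division rule can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `divide` and the sample values are chosen here only for illustration. It computes $k_1$ and $k_2$ from the formulas above and compares the result with Python's built-in complex division.

```python
def divide(a1, a2, b1, b2):
    """Return (k1, k2) such that k1 + i*k2 = (a1 + i*a2) / (b1 + i*b2)."""
    denom = b1 * b1 + b2 * b2  # |b|^2 = b b*, always real
    if denom == 0:
        raise ZeroDivisionError("b must not be zero")
    k1 = (a1 * b1 + a2 * b2) / denom
    k2 = (a2 * b1 - a1 * b2) / denom
    return k1, k2


# Illustrative values: (1 + 2i) / (3 - 4i)
k1, k2 = divide(1.0, 2.0, 3.0, -4.0)
reference = complex(1.0, 2.0) / complex(3.0, -4.0)
print(k1, k2)      # real part and coefficient of the imaginary part
print(reference)   # the same quotient from the built-in complex type
```

Multiplying numerator and denominator by the conjugate is what makes the denominator real, so only real divisions are needed.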

Our rules permit representing any rational function of the complex
variable $z = x + iy$ (for instance, $f = z/(1 + z^2)$) as the sum
$f = u + iv$, where $u$ and $v$ are two real-valued functions of two
real variables $x$ and $y$ (see Exercise 14.1.1).

The problem of taking the square root of a complex number can be solved
in a similar way. Indeed, if $r = r_1 + ir_2 = \sqrt{a} =
\sqrt{a_1 + ia_2}$, then $(r_1 + ir_2)^2 = (r_1^2 - r_2^2) + i\,2r_1r_2
= a = a_1 + ia_2$, which results in the following system of equations
for determining the unknowns $r_1$ and $r_2$:

$$r_1^2 - r_2^2 = a_1, \qquad 2r_1r_2 = a_2.$$

This system is readily solvable. Since $r_1^2 + (-r_2^2) = a_1$ and
$r_1^2(-r_2^2) = -(a_2^2/4)$, the numbers $r_1^2$ and $-r_2^2$ are the
roots of the quadratic equation $X^2 - a_1X - (a_2^2/4) = 0$, whence

$$r_1 = \sqrt{\frac{|a| + a_1}{2}}, \qquad r_2 = \sqrt{\frac{|a| - a_1}{2}},$$

where, as usual, $|a| = \sqrt{a_1^2 + a_2^2}$. (Note that both
radicands are nonnegative.)

Figure 14.1.1

Finally we have

$$\sqrt{a} = \sqrt{a_1 + ia_2} = \pm\left(\sqrt{\frac{|a| + a_1}{2}} + i\sqrt{\frac{|a| - a_1}{2}}\right).$$

(This form holds for $a_2 \geq 0$; since $2r_1r_2 = a_2$, for $a_2 < 0$
the numbers $r_1$ and $r_2$ must have opposite signs, and the plus sign
before $i$ is replaced by a minus sign.)
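As a hedged illustration of the square-root formulas, the sketch below (the function name `complex_sqrt` is hypothetical, introduced only for this example) computes $r_1$ and $r_2$ from $|a|$ and $a_1$, gives $r_2$ the sign of $a_2$ so that $2r_1r_2 = a_2$ holds, and compares the result with the standard-library routine `cmath.sqrt`.

```python
import cmath
import math


def complex_sqrt(a1, a2):
    """One square root r1 + i*r2 of a1 + i*a2 (the other one is its negative)."""
    mod = math.hypot(a1, a2)                       # |a|
    r1 = math.sqrt(max((mod + a1) / 2, 0.0))
    r2 = math.sqrt(max((mod - a1) / 2, 0.0))
    if a2 < 0:
        r2 = -r2                                   # 2*r1*r2 must equal a2
    return complex(r1, r2)


for a in (complex(3, 4), complex(3, -4), complex(-7, 0)):
    r = complex_sqrt(a.real, a.imag)
    print(a, r, r * r, cmath.sqrt(a))
```

The `max(..., 0.0)` guards only protect against tiny negative values caused by rounding; mathematically both radicands are nonnegative.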

Thus, introducing the new, imaginary, number $i$ in order to make
solvable the problem of taking the square root of $-1$, we
unexpectedly were able to take square roots of all numbers: "old",
real, and "new", complex.

Before going further we will consider the geometric representation of
complex numbers. The complex number $z = x + iy$ is usually considered
as a point of the so-called complex plane (or the plane of the complex
variable) with (Cartesian) coordinates $x$ and $y$ (Figure 14.1.1).
Here the $x$-axis is called the real axis (its points correspond to the
real numbers $x + i0 = x$) and the $y$-axis, the imaginary axis. To
each complex number there corresponds a single point of the plane and
vice versa; to conjugate complex numbers there correspond points which
are symmetrical about the real axis. The absolute value
$\sqrt{zz^*} = |z| = \sqrt{x^2 + y^2}$ of the complex number
$z = x + iy$ is equal to the distance $Oz$ of point $z$ from the origin
(corresponding to the number 0).

The straight line $b_1x + b_2y + c = 0$ of the plane of the complex
variable $z = x + iy$ can be given by the "complex equation"

$$bz + b^*z^* + c = 0, \tag{14.1.2}$$

where $c$ is real, that is, $c^* = c$, and $b = \tfrac{1}{2}(b_1 - ib_2)$.
Indeed, if $z = x + iy$, then the left-hand side of (14.1.2), as is
readily seen, equals $b_1x + b_2y + c$. The equation
$(x^2 + y^2) + b_1x + b_2y + c = 0$ of a circle in the plane of the
complex variable can be rewritten in a form similar to (14.1.2):

$$zz^* + bz + b^*z^* + c = 0, \tag{14.1.2a}$$

with $c^* = c$ and $b = \tfrac{1}{2}(b_1 - ib_2)$, since
$zz^* = x^2 + y^2$ and $bz + b^*z^* = b_1x + b_2y$.

Equations (14.1.2) and (14.1.2a) can be combined as follows:

$$azz^* + bz + b^*z^* + c = 0, \tag{14.1.2b}$$

where $a$ and $c$ are real ($a^* = a$, $c^* = c$). Equation (14.1.2b)
describes a circle if $a \neq 0$, and a straight line if $a = 0$.

The position of point $z$ on the complex plane can also be
characterized by its polar coordinates, the distance $\rho$ of point
$z$ from the origin $O$ (which is the absolute value $|z|$ of number
$z$) and the angle $\varphi$ between the ray $Oz$ and the positive ray
of the real axis (the angle $\varphi$ is called the argument
(amplitude or phase) of number $z$; sometimes it is denoted
$\operatorname{Arg} z$). Here (see Figure 14.1.1)

$$x = \rho\cos\varphi, \quad y = \rho\sin\varphi,$$

$$\rho = \sqrt{x^2 + y^2} = |z|,$$

$$\tan\varphi = \frac{y}{x}, \quad \cos\varphi = \frac{x}{\rho}, \quad \sin\varphi = \frac{y}{\rho}.$$

Thus

$$z = \rho(\cos\varphi + i\sin\varphi) \tag{14.1.3}$$

(which is the trigonometric form of a complex number).

The form (14.1.3) is especially convenient for multiplication and
division of complex numbers

$$z_1 = \rho_1(\cos\varphi_1 + i\sin\varphi_1) \quad \text{and} \quad z_2 = \rho_2(\cos\varphi_2 + i\sin\varphi_2).$$

Indeed, it is clear that

$$z_1z_2 = \rho_1(\cos\varphi_1 + i\sin\varphi_1)\,\rho_2(\cos\varphi_2 + i\sin\varphi_2)$$
$$= \rho_1\rho_2[(\cos\varphi_1\cos\varphi_2 - \sin\varphi_1\sin\varphi_2) + i(\cos\varphi_1\sin\varphi_2 + \sin\varphi_1\cos\varphi_2)]$$
$$= \rho_1\rho_2[\cos(\varphi_1 + \varphi_2) + i\sin(\varphi_1 + \varphi_2)], \tag{14.1.4}$$

or in words: when multiplying complex numbers their absolute values
are multiplied and the arguments are added. This leads to the
following. If $z_1/z_2 = z_3 = \rho_3(\cos\varphi_3 + i\sin\varphi_3)$,
then $z_1 = z_2z_3$, that is, $\rho_1 = \rho_2\rho_3$,
$\varphi_1 = \varphi_2 + \varphi_3$, whence $\rho_3 = \rho_1/\rho_2$,
$\varphi_3 = \varphi_1 - \varphi_2$. Thus,

$$\frac{z_1}{z_2} = \frac{\rho_1}{\rho_2}[\cos(\varphi_1 - \varphi_2) + i\sin(\varphi_1 - \varphi_2)]; \tag{14.1.4a}$$

when dividing complex numbers their absolute values are divided and
the arguments are subtracted.

Note that the rules

$$|z_1z_2| = |z_1|\,|z_2|, \qquad \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|} \tag{14.1.5}$$

completely coincide with those for real numbers, while the rules for
the arguments of complex numbers,

$$\operatorname{Arg}(z_1z_2) = \operatorname{Arg} z_1 + \operatorname{Arg} z_2, \qquad \operatorname{Arg}\frac{z_1}{z_2} = \operatorname{Arg} z_1 - \operatorname{Arg} z_2, \tag{14.1.5a}$$

are new to us. In form, Eqs. (14.1.5a) resemble the rules of operation
on logarithms, but it is not yet quite clear what relates the angle
$\operatorname{Arg} z = \varphi$ to logarithms (do complex numbers
really have logarithms? We will answer this question later on).
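The rules for moduli and arguments are easy to observe numerically. The sketch below is an illustration only; the sample moduli and angles are arbitrary choices. It uses the standard-library functions `cmath.rect` and `cmath.polar` to convert between the trigonometric and Cartesian forms.

```python
import cmath
import math

# Two numbers given in trigonometric form (illustrative values).
z1 = cmath.rect(2.0, math.pi / 6)   # modulus 2, argument pi/6
z2 = cmath.rect(3.0, math.pi / 3)   # modulus 3, argument pi/3

rho_p, phi_p = cmath.polar(z1 * z2)  # product
rho_q, phi_q = cmath.polar(z1 / z2)  # quotient

print(rho_p, phi_p)  # modulus 2*3, argument pi/6 + pi/3
print(rho_q, phi_q)  # modulus 2/3, argument pi/6 - pi/3
```

Note that `cmath.polar` reports the argument in the interval from $-\pi$ to $\pi$, so for other inputs the sum of arguments may differ from the reported value by a multiple of $2\pi$.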

Multiplying, according to Eq. (14.1.4), the number
$z = \rho(\cos\varphi + i\sin\varphi)$ by itself $n$ times we obtain
the de Moivre formula

$$z^n = \rho^n(\cos n\varphi + i\sin n\varphi). \tag{14.1.6}$$

This formula is also valid for negative and fractional $n$. For
example, noting that $1 = 1(\cos 0 + i\sin 0)$ we get

$$z^{-1} = \frac{1}{z} = \frac{1}{\rho}[\cos(-\varphi) + i\sin(-\varphi)] = \frac{1}{\rho}(\cos\varphi - i\sin\varphi) \tag{14.1.6a}$$

and

$$z^{1/3} = \sqrt[3]{r(\cos\alpha + i\sin\alpha)} = \sqrt[3]{r}\left(\cos\frac{\alpha}{3} + i\sin\frac{\alpha}{3}\right) \tag{14.1.6b}$$

Figure 14.1.2

(see Exercise 14.1.2). Note that formula (14.1.6b) yields three values
of the cube root of number $z$: since instead of $\alpha$ the argument
of $z$ can be written as $\alpha + 2\pi$ or $\alpha + 4\pi$, we have

$$z_1 = \sqrt[3]{r}(\cos\beta + i\sin\beta),$$

$$z_2 = \sqrt[3]{r}\left[\cos\left(\beta + \frac{2\pi}{3}\right) + i\sin\left(\beta + \frac{2\pi}{3}\right)\right],$$

$$z_3 = \sqrt[3]{r}\left[\cos\left(\beta + \frac{4\pi}{3}\right) + i\sin\left(\beta + \frac{4\pi}{3}\right)\right],$$

with $\beta = \alpha/3$ (Figure 14.1.2).
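The same construction gives all $n$ values of an $n$th root. The following sketch assumes a hypothetical helper `nth_roots` (not from the text) that shifts the argument by multiples of $2\pi$ before dividing it by $n$, exactly as done above for the cube root.

```python
import cmath
import math


def nth_roots(c, n):
    """Return the n values of the n-th root of the nonzero complex number c."""
    r, alpha = cmath.polar(c)          # c = r (cos alpha + i sin alpha)
    root_r = r ** (1.0 / n)            # real n-th root of the modulus
    return [cmath.rect(root_r, (alpha + 2 * math.pi * k) / n) for k in range(n)]


# Illustrative example: the three cube roots of 8i.
for z in nth_roots(8j, 3):
    print(z, z ** 3)   # each cube should reproduce 8i up to rounding
```

All the roots have the same modulus and their arguments differ by $2\pi/n$, so they lie at equal angular steps on one circle.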

Thus, the use of complex numbers makes it possible to find three roots
$z_1$, $z_2$, and $z_3$ of the cubic equation $z^3 - c = 0$, where
$c = r(\cos\alpha + i\sin\alpha)$ is an arbitrary complex number.
Moreover, we can show that each algebraic equation of $n$th degree

$$P(z) = a_0z^n + a_1z^{n-1} + a_2z^{n-2} + \ldots + a_{n-1}z + a_n = 0,$$

where $a_0 \neq 0$, $a_1$, $a_2$, $\ldots$, $a_n$ are arbitrary complex
numbers, has $n$ (and only $n$), generally speaking, complex roots
$z_1$, $z_2$, $\ldots$, $z_n$ which are not necessarily distinct (some
time ago this important proposition was called the fundamental theorem
of algebra). Here the polynomial $P(z)$ is the product of $n$ linear
factors:

$$P(z) = a_0(z - z_1)(z - z_2)\ldots(z - z_n).$$

For example, the quadratic equation

$$Q(z) = z^2 + pz + q = 0 \tag{14.1.7}$$

has two roots

$$z_{1,2} = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q} \tag{14.1.7a}$$

with

$$Q(z) = (z - z_1)(z - z_2).$$

For real $p$, $q$ the roots (14.1.7a) of Eq. (14.1.7) will be real and
distinct for $p^2/4 - q > 0$ (recall that the real number
$x = x + i0$ is a particular case of a complex number); the roots will
be the same (equal to $-p/2$) for $p^2/4 - q = 0$; and
complex-conjugate (equal to $-p/2 \pm i\sqrt{q - p^2/4}$, the radicand
being now positive) for $p^2/4 - q < 0$. (When we say that for real
$p$, $q$ and $p^2/4 < q$ the quadratic equation (14.1.7) has no
solution, we are not mistaken; we just mean that, operating only with
real numbers, we have to consider in this case that Eq. (14.1.7) has no
roots.) The equation

$$z^5 - 3z^4 + 7z^3 - 13z^2 + 12z - 4 = (z - 1)^3(z^2 + 4) = 0 \tag{14.1.8}$$

has five roots, of which only three are distinct:

$$z_1 = z_2 = z_3 = 1, \quad z_4 = 2i, \quad \text{and} \quad z_5 = -2i.$$
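A short numerical sketch of the quadratic formula follows. The function names `quadratic_roots` and `quintic` and the sample coefficients are illustrative assumptions; `cmath.sqrt` is used so that a negative discriminant produces the complex-conjugate pair automatically. The last lines check the stated roots of the fifth-degree example, whose expanded coefficients are obtained by multiplying out $(z - 1)^3(z^2 + 4)$.

```python
import cmath


def quadratic_roots(p, q):
    """Both roots of z^2 + p z + q = 0, computed as -p/2 +- sqrt(p^2/4 - q)."""
    d = cmath.sqrt(p * p / 4 - q)
    return -p / 2 + d, -p / 2 - d


print(quadratic_roots(-3, 2))   # p^2/4 - q > 0: two distinct real roots
print(quadratic_roots(2, 1))    # p^2/4 - q = 0: a repeated root -p/2
print(quadratic_roots(2, 5))    # p^2/4 - q < 0: complex-conjugate roots


def quintic(z):
    """Expanded form of (z - 1)^3 (z^2 + 4)."""
    return z**5 - 3 * z**4 + 7 * z**3 - 13 * z**2 + 12 * z - 4


for root in (1, 2j, -2j):
    print(root, quintic(root))  # each value should vanish
```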

If all the coefficients of the equation (of any order)
$a_0z^n + a_1z^{n-1} + \ldots + a_{n-1}z + a_n = 0$ are real, then to
each "essentially complex" (i.e., not real) root $z = x + iy$ of the
equation there corresponds a second root $z^* = x - iy$, which is
complex-conjugate to the first root. This follows from the rules of
operation on complex numbers, according to which

$$(z_1 + z_2)^* = z_1^* + z_2^*, \quad (z_1 - z_2)^* = z_1^* - z_2^*, \quad (z_1z_2)^* = z_1^*z_2^* \tag{14.1.9}$$

(check it). By virtue of (14.1.9), if, for instance, $z_0 = x_0 + iy_0$
is a root of the equation
$Q(x) = a_0x^4 + a_1x^3 + a_2x^2 + a_3x + a_4 = 0$ with real
coefficients, that is,

$$Q(z_0) = a_0z_0^4 + a_1z_0^3 + a_2z_0^2 + a_3z_0 + a_4 = 0,$$

then

$$a_0^*(z_0^*)^4 + a_1^*(z_0^*)^3 + a_2^*(z_0^*)^2 + a_3^*(z_0^*) + a_4^* = 0^*.$$

But $a_0^* = a_0$, $\ldots$, $a_4^* = a_4$ and naturally $0^* = 0$,
therefore

$$Q(z_0^*) = a_0(z_0^*)^4 + a_1(z_0^*)^3 + a_2(z_0^*)^2 + a_3(z_0^*) + a_4 = 0,$$

that is, $z_0^* = x_0 - iy_0$ is also a root of the same equation
(distinct from $z_0$ for $y_0 \neq 0$).

From the foregoing it follows that each real polynomial (i.e., such
that all its coefficients are real) can be expanded into (real) linear
and quadratic factors (e.g., see the expansion (14.1.8)). Indeed, the
polynomial $P(z) = a_0z^n + a_1z^{n-1} + \ldots + a_{n-1}z + a_n$ has
$n$ (real or complex) roots $z_1$, $z_2$, $\ldots$, $z_n$, at which it
vanishes: $P(z_1) = P(z_2) = \ldots = P(z_n) = 0$. Here $P(z)$ can be
expanded into a product

$$P(z) = a_0(z - z_1)(z - z_2)\ldots(z - z_n).$$

But if $z_0 = x_0 + iy_0$, where $y_0 \neq 0$, is a complex root of the
polynomial $P(z)$ and $z_0^* = x_0 - iy_0$ is the root
complex-conjugate to $z_0$, then

$$(z - z_0)(z - z_0^*) = (z - x_0 - iy_0)(z - x_0 + iy_0) = [(z - x_0) - iy_0][(z - x_0) + iy_0]$$
$$= (z - x_0)^2 - (iy_0)^2 = (z - x_0)^2 + y_0^2 = z^2 - 2x_0z + (x_0^2 + y_0^2).$$

Thus, to each pair of complex-conjugate roots $z_0$, $z_0^*$ of the
polynomial $P(z)$ there corresponds the quadratic factor
$z^2 - 2x_0z + (x_0^2 + y_0^2)$ in its expansion, and to a real root
$z_1 = x_1 + i0 = x_1$ (if such a root exists) there corresponds a
linear factor $z - z_1 = z - x_1$ of the expansion of $P(z)$. This
reasoning proves the required assertion. For example, the
inexperienced reader may consider the expansion

$$x^4 + a^4 = x^4 + 2a^2x^2 + a^4 - 2a^2x^2 = (x^2 + a^2)^2 - (\sqrt{2}\,ax)^2$$
$$= [(x^2 + a^2) + \sqrt{2}\,ax][(x^2 + a^2) - \sqrt{2}\,ax] = (x^2 + \sqrt{2}\,ax + a^2)(x^2 - \sqrt{2}\,ax + a^2)$$

to be accidental and artificial (how could we know that $2a^2x^2$
should have been added to and $2a^2x^2$ subtracted from the expression
$x^4 + a^4$?). But the knowledgeable reader will understand that this
expansion follows with necessity from the general fact formulated
above of expandability of real polynomials and from the formula

$$x = \sqrt[4]{-a^4} = a\sqrt[4]{-1}$$

for the roots of the polynomial $x^4 + a^4$ (see Exercise 14.1.4).

The fact that in the domain of complex numbers the square root can be
taken of any (positive or negative, real or complex) number and that
each quadratic equation has here two roots is certainly gratifying,
since it shows that the set of complex numbers is, in a certain sense,
"constructed more simply" than the set of real numbers. However, we
could at least have expected this result; indeed, the imaginary unit
$i$ was introduced precisely to make it possible for us to take the
square root of $-1$ (or to solve the quadratic equation
$x^2 + 1 = 0$). Of course, here we have obtained more than we had
reason to expect (that is, the existence of not only one square root
$\sqrt{-1}$, previously nonexistent, but of all possible square roots
of all complex numbers; the solvability of not only the specially
chosen equation $x^2 + 1 = 0$, but of all quadratic equations), but
this just indicates how successful the introduction of
$\sqrt{-1} = i$ was. However, the fact that complex numbers
$z = x + iy$ permit taking any roots of all (both "old" and "new")
numbers and solving algebraic equations of the third, fourth, fifth,
and any other degree, so that to make all equations solvable we have
no need to introduce any new numbers and can manage with the complex
numbers we used to solve quadratic equations, is certainly a surprise.
All this suggests that "there is something" in complex numbers, that
they mean more than just the taking of roots of negative numbers, that
complex numbers are not at all accidental and can be useful in many
areas of mathematics and mathematized natural sciences.

These hopes have materialized.


Exercises 

14.1.1. Let $z = x + iy$ and $f(z) = u + iv$. Find $u$ and $v$ as
functions of $x$ and $y$ if the function $f(z)$ has the form:
(a) $f = z^3 - 3z + 1$, (b) $f = z/(1 + z^2)$, (c) f =

14.1.2. Prove formulas (14.1.6a) and (14.1.6b) and the validity of
de Moivre's formula (14.1.6) for any (real) $n$.

14.1.3. Prove that the $n$ values of the $n$th root of an arbitrary
complex number $c$ (the $n$ roots of the $n$th order equation
$z^n - c = 0$) are represented geometrically by $n$ points of the
complex plane, which are the vertices of a regular polygon of $n$
sides.

14.1.4. Write all the values of $\sqrt[4]{-1}$. Using the above
formulas factorize the polynomial $z^4 + a^4$ (use real factors).

14.2 Raising a Number to an 
Imaginary Power and the Number e 

We pose the problem of determining an imaginary power. How can we find
the number $10^i$? Can we multiply 10 by itself $i$ times, that is,
$\sqrt{-1}$ times? At first glance this question seems senseless.
However, the question of the result of multiplying 10 by itself $-3$
times or $1/2$ time also seems senseless at first, since literal
application of the rule of raising a number to an integral positive
power ($a^n = \underbrace{aa \ldots a}_{n\ \text{times}}$) is
impossible here; yet the numbers $10^{-3}$ ($= 0.001$) and $10^{1/2}$
($= \sqrt{10} \approx 3.16$) undoubtedly exist.

Let us recall the logical sequence in which negative and fractional
powers of numbers are defined, the example being instructive. First,
we only define raising a number to an integral positive power $n$:

$$a^n = \underbrace{aaa \ldots a}_{n\ \text{times}}.$$

From this definition, for integral positive powers as before, two
rules follow:

$$a^{n+m} = \underbrace{aa \ldots a}_{n+m\ \text{times}} = \underbrace{a \ldots a}_{n\ \text{times}}\,\underbrace{a \ldots a}_{m\ \text{times}} = a^na^m, \tag{14.2.1}$$

$$(a^n)^m = \underbrace{\underbrace{a \ldots a}_{n\ \text{times}}\,\underbrace{a \ldots a}_{n\ \text{times}} \ldots \underbrace{a \ldots a}_{n\ \text{times}}}_{m\ \text{times}} = a^{nm} \tag{14.2.1a}$$

(cf. definitions of addition and multiplication of natural numbers on
p. 458). Generalization consists in that we require the rules (14.2.1)
and (14.2.1a) to be fulfilled for negative and fractional powers as
well. Then the rule (14.2.1) yields

$$a^4a^{-3} = a^{4+(-3)} = a^1 = a,$$

and therefore

$$a^{-3} = \frac{a}{a^4} = \frac{1}{a^3},$$

and similarly for any natural $m$

$$a^{-m} = \frac{1}{a^m}.$$

In a similar way, by virtue of (14.2.1a),

$$(a^{1/3})^3 = a^{3/3} = a^1 = a,$$

and therefore

$$a^{1/3} = \sqrt[3]{a} \quad \text{and generally} \quad a^{1/q} = \sqrt[q]{a}.$$

Now for any fraction $n = p/q$ we have

$$a^{p/q} = (a^{1/q})^p = (\sqrt[q]{a})^p = \sqrt[q]{a^p}. \tag{14.2.2}$$

Figure 14.2.1

Since for any real (irrational) number we can find an arbitrarily
close rational fraction, the problem of raising a number to any real
power is completely solved. However, the techniques we used to find
negative and fractional powers are themselves insufficient to
determine the imaginary power $a^{ik}$ ($a$ and $k$ being real
numbers); for this purpose we have to use the data we have gained in
studying the derivative of the exponential function.

Let us use the fact that

$$e^r \approx 1 + r \quad \text{for } |r| \ll 1 \tag{14.2.3}$$

by the definition of the number $e$ (cf. Section 4.8, formula
(4.8.2)), the approximate equality (14.2.3) being the more exact the
smaller $|r|$.$^{14.3}$ Let us agree now that Eq. (14.2.3) is also
valid for complex $r$'s which are small in absolute value, in
particular also for pure imaginary numbers $ri$ whose absolute value
$|ri| = |r|$ is small. In order to determine the number $e^{ki}$, with
$k$ being any (not small) real number, we use the fact that, by virtue
of (14.2.3),

$$e^{ki/n} \approx 1 + \frac{k}{n}i \quad \text{for } n \gg 1, \tag{14.2.3a}$$

where $n$, as follows from the notation $n \gg 1$, is understood to be
a very large number (we will consider it to be natural) so that
$k/n \ll 1$.

14.3 The difference $\alpha = e^r - (1 + r)$ is of the second order of
smallness with respect to $r$: for small $|r|$ the ratio $\alpha/r^2$
remains "finite", that is, not too small and not too large. (Instead
of referring to Eq. (4.8.2), we could have, which is the same,
proceeded from the definition (4.8.3a) of an exponential function,
$e^k = \lim_{n\to\infty}(1 + k/n)^n$, assuming that the latter formula
is also valid for complex (pure imaginary, in particular) powers $k$.)

The point $z = 1 + ki/n$ ($= \rho(\cos\varphi + i\sin\varphi)$) of the
complex plane has Cartesian coordinates $(1, k/n)$ and polar
coordinates $(\rho, \varphi)$ (the absolute value and argument of the
complex number $z$), where

$$\rho = \sqrt{1 + \frac{k^2}{n^2}}, \quad \sin\varphi = \frac{k/n}{\sqrt{1 + k^2/n^2}}, \quad \tan\varphi = \frac{k}{n} \tag{14.2.4}$$

(Figure 14.2.1). For $n \gg 1$, that is, when $k/n \ll 1$, the (exact)
equalities (14.2.4) can be replaced by the (approximate) relations

$$\rho \approx 1, \quad \varphi \approx \frac{k}{n}; \tag{14.2.5}$$

the approximate nature of (14.2.5) indicates that terms of the
$(k/n)^2$ and higher orders of smallness are neglected (see the text
below printed in small type). Since, by virtue of de Moivre's formula
(14.1.6),

$$e^{ki} \approx \left(1 + \frac{k}{n}i\right)^n = [\rho(\cos\varphi + i\sin\varphi)]^n = \rho^n(\cos n\varphi + i\sin n\varphi),$$

we have

$$\left(1 + \frac{k}{n}i\right)^n \approx 1^n\left[\cos\left(n\frac{k}{n}\right) + i\sin\left(n\frac{k}{n}\right)\right] = \cos k + i\sin k \tag{14.2.3b}$$

and consequently

$$e^{ki} = \cos k + i\sin k. \tag{14.2.6}$$

This relation can be assumed to be the definition of the exponential
function of an imaginary argument; it follows from the supposition
(14.2.3a). Formula (14.2.6) is known as the Euler formula.

The Euler formula (14.2.6) helps find the number $10^i$:

$$10^i = (e^{\ln 10})^i = e^{\ln 10 \cdot i} \approx e^{2.3i} = \cos 2.3 + i\sin 2.3 \approx -0.67 + 0.75i.$$
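The limiting argument can be observed numerically. In the sketch below (the helper name `euler_limit` and the chosen values of $n$ are assumptions made for illustration), the power $(1 + ki/n)^n$ is evaluated for growing $n$ with $k = \ln 10$ and compared with $\cos k + i\sin k$ and with Python's own value of `10 ** 1j`.

```python
import cmath
import math


def euler_limit(k, n):
    """The approximation (1 + k i / n)^n to e^(k i)."""
    return (1 + 1j * k / n) ** n


k = math.log(10)                      # so that e^(k i) = 10^i
target = complex(math.cos(k), math.sin(k))

for n in (10, 1000, 100000):
    approx = euler_limit(k, n)
    print(n, approx, abs(approx - target))   # the error shrinks as n grows

print(cmath.exp(1j * k))   # cos k + i sin k via the library exponential
print(10 ** 1j)            # the same number, computed directly
```

The error decreases roughly in proportion to $1/n$, in line with the estimate of the modulus given in the small-type text of this section.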


Another method for deriving the Euler formula is connected with the
expansion of functions into series. As we know (see Chapter 6)

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots, \tag{14.2.7}$$

this series being convergent for all $x$'s, which gives us courage to
consider the number $x$ in the series to be complex as well. Suppose,
by definition,

$$e^{ix} = 1 + \frac{ix}{1!} + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \frac{(ix)^5}{5!} + \ldots = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \ldots\right) + i\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots\right). \tag{14.2.8}$$

But since (see Chapter 6)

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \ldots, \tag{14.2.9}$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots \tag{14.2.9a}$$

(the series (14.2.9) and (14.2.9a) are also convergent for all $x$'s),
we again obtain the Euler formula

$$e^{ix} = \cos x + i\sin x.$$

Further we can naturally assume that if $z = x + iy$, then

$$e^z = e^{x+iy} = e^xe^{iy} = e^x(\cos y + i\sin y) \tag{14.2.10}$$

(cf. Exercise 14.2.1).

Formula (14.2.10) can, of course, be immediately taken as the
definition of the exponential function $e^z$ of the complex argument
$z = x + iy$ (compare with the above reasoning for the Euler formula
(14.2.6)). This definition is justified by the fact that if
$z = x + i0 = x$, then $e^z$ becomes the exponential function $e^x$ of
the real variable $x$, and also by the fact that the function
(14.2.10) retains the basic property (14.2.1) of the exponential
function: if $z = z_1 + z_2$, where $z_1 = x_1 + iy_1$ and
$z_2 = x_2 + iy_2$, then

$$e^z = \exp[(x_1 + x_2) + i(y_1 + y_2)] = e^{x_1 + x_2}[\cos(y_1 + y_2) + i\sin(y_1 + y_2)]$$
$$= (e^{x_1}e^{x_2})[(\cos y_1 + i\sin y_1)(\cos y_2 + i\sin y_2)] = [e^{x_1}(\cos y_1 + i\sin y_1)][e^{x_2}(\cos y_2 + i\sin y_2)] = e^{z_1}e^{z_2}.$$

By virtue of formulas (14.2.4) and (6.4.6),

$$\rho = \left(1 + \frac{k^2}{n^2}\right)^{1/2} \approx 1 + \frac{1}{2}\frac{k^2}{n^2} + \ldots = 1 + \frac{k^2}{2n^2} + \ldots,$$

where the dots indicate terms of order higher than the term
$k^2/2n^2$; in accordance with the (more general) formula (6.4.3), we
have

$$\rho^n = \left(1 + \frac{k^2}{2n^2} + \ldots\right)^n \approx 1 + n\frac{k^2}{2n^2} + \ldots = 1 + \frac{k^2}{2n} + \ldots,$$

whence, owing to the smallness of the second term $k^2/2n$ for
$n \gg 1$, it follows that the absolute value of the number
$(1 + ki/n)^n$ ($\approx e^{ki}$) can be assumed to (approximately)
equal 1. Similarly, we can also verify that it is not a big mistake if
we assume that the argument of the number $(1 + ki/n)$ equals $k/n$
(see Exercise 14.2.2).
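The series definition of the exponential function of a complex argument, and the product rule $e^{z_1 + z_2} = e^{z_1}e^{z_2}$, can both be checked with partial sums. This is only a sketch: the function name `exp_series`, the number of terms, and the sample points are assumptions made for the example.

```python
import math


def exp_series(z, terms=40):
    """Partial sum 1 + z/1! + z^2/2! + ... of the exponential series."""
    total = 0j
    term = 1 + 0j                 # current term z^m / m!, starting with m = 0
    for m in range(terms):
        total += term
        term *= z / (m + 1)       # turn z^m/m! into z^(m+1)/(m+1)!
    return total


# Euler formula: the series at a pure imaginary argument.
x = 2.3
print(exp_series(1j * x), complex(math.cos(x), math.sin(x)))

# Product rule for two arbitrary complex arguments.
z1, z2 = 0.5 + 1.2j, -1.0 + 0.7j
print(exp_series(z1 + z2), exp_series(z1) * exp_series(z2))
```

Forty terms are far more than needed for arguments of this size; the factorial in the denominator makes the terms decrease very quickly.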


Exercises

14.2.1. Prove that the relation
$e^{x+iy} \approx \left(1 + \frac{x + iy}{n}\right)^n$ for $n \gg 1$
(that is, $e^{x+iy} = \lim_{n\to\infty}\left(1 + \frac{x + iy}{n}\right)^n$)
means the same as formula (14.2.10).

14.2.2. Prove that in the formula
$1 + ki/n = \rho(\cos\varphi + i\sin\varphi)$ for $n \gg 1$ we have
$\varphi = k/n + \ldots$, where the dots indicate terms of the $1/n^2$
and higher orders of smallness.

14.2.3. Prove that for any complex (and not only for pure imaginary)
number $\Delta z$ small in absolute value, we have
$e^{\Delta z} = 1 + \Delta z + \ldots$, the dots indicating the terms
whose absolute values are of order $|\Delta z|^2$ and higher.


14.3 Trigonometric Functions and 
the Logarithm 

The Euler formula (14.2.6) reveals a 
deep inner relation of periodic functions 
cos x and sin x to the exponential 
function. At first glance the exponential 
function a x has nothing in common with 
periodicity: for a > 1 the value of a x 
monotonically increases with increas- 
ing x , while for a < 1 it always de- 
creases; there is no periodicity at all. 
But if $a = -1$, then the powers of the number $a$ assume the
following values:

$$-1, +1, -1, +1, -1, +1, \ldots, \tag{14.3.1}$$

that is, they change periodically.$^{14.4}$ If we write

$$-1 = \cos\pi + i\sin\pi = e^{i\pi},$$

then we can consider the power $n$ in the notation $(-1)^n$ to be any
real number:

$$(-1)^n = (e^{i\pi})^n = e^{i\pi n} = \cos(\pi n) + i\sin(\pi n).$$

Here the (generally speaking, complex) quantity $(-1)^n$ assumes
various values whose modulus is 1 and changes periodically with $n$.
The value of $a^x$ changes in the same manner, where
$a = \cos\varphi + i\sin\varphi = e^{i\varphi}$ is an arbitrary
complex number with unit modulus: in this case

$$a^x = (e^{i\varphi})^x = e^{i\varphi x} = \cos(\varphi x) + i\sin(\varphi x).$$

14.4 We have just limited ourselves to natural (integral positive)
values of the power. If we let $n$ also assume integral negative
values and vanish, we will infinitely continue to the left the
periodic sequence (14.3.1) of numbers $(-1)^n$.

Now, if $a = \rho(\cos\varphi + i\sin\varphi) = \rho e^{i\varphi}$ is a
complex number whose modulus $\rho$ is not unity ($\rho > 0$ and
$\rho \neq 1$), then

$$a^x = (\rho e^{i\varphi})^x = \rho^xe^{i\varphi x} = \rho^x[\cos(\varphi x) + i\sin(\varphi x)] \tag{14.3.2}$$

can be expanded into a product of two factors: one of them, $\rho^x$
(which determines the absolute value of the number $a^x$), will
monotonically increase or monotonically decrease with increasing $x$,
while the other, equal to $e^{i\varphi x}$ (it is responsible for the
argument of the number $a^x$), will change periodically. If
$\rho < 1$, then the real part and the coefficient of the imaginary
part of $a^x$, that is, the expressions $\rho^x\cos(\varphi x)$ and
$\rho^x\sin(\varphi x)$, will describe damped oscillations (see
Section 10.4 and, in particular, Figure 10.4.2). Later (see Section
17.1) we will see that the relation between the function $a^x$ and
damped oscillations is by no means accidental.

The Euler formula (14.2.6) enables 
constructing many other important 
functions of a real, imaginary, or com- 
plex variable. Let us begin with re- 
writing this formula and expressing 
trigonometric functions via an expo- 
nential function, rather than an expo- 
nential function via trigonometric ones. 
From

$$e^{i\varphi} = \cos\varphi + i\sin\varphi, \tag{14.3.3}$$

it also follows that

$$e^{-i\varphi} = \cos(-\varphi) + i\sin(-\varphi) = \cos\varphi - i\sin\varphi. \tag{14.3.3a}$$

Adding (14.3.3) and (14.3.3a) term by term and then subtracting
(14.3.3a) from (14.3.3) we readily obtain

$$\cos\varphi = \frac{e^{i\varphi} + e^{-i\varphi}}{2}, \qquad \sin\varphi = \frac{e^{i\varphi} - e^{-i\varphi}}{2i}. \tag{14.3.4}$$

The corollaries (14.3.4) of (14.3.3) are also often called Euler
formulas.
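The expressions of the cosine and sine through exponentials can be tried out directly. The sketch below is illustrative only; the function names `cos_via_exp` and `sin_via_exp` are introduced here and are not from the text. It evaluates both expressions with the library function `cmath.exp` and compares them with `math.cos` and `math.sin` for a real angle.

```python
import cmath
import math


def cos_via_exp(phi):
    """(e^(i phi) + e^(-i phi)) / 2."""
    return (cmath.exp(1j * phi) + cmath.exp(-1j * phi)) / 2


def sin_via_exp(phi):
    """(e^(i phi) - e^(-i phi)) / (2 i)."""
    return (cmath.exp(1j * phi) - cmath.exp(-1j * phi)) / (2j)


phi = 0.7  # an arbitrary real angle in radians
print(cos_via_exp(phi), math.cos(phi))
print(sin_via_exp(phi), math.sin(phi))
```

For a real angle the imaginary parts of both results vanish up to rounding, since each expression is unchanged when $i$ is replaced by $-i$.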

Note that for real $\varphi$ the terms on the right-hand sides of
(14.3.4) are certainly real (they are the cosine and the sine of an
angle): indeed, substituting the expansion (14.2.8) and a similar one
for $e^{i\varphi}$ and $e^{-i\varphi}$ in these expressions we again
obtain formulas (14.2.9) and (14.2.9a). There is another approach to
readily convince ourselves that the expressions obtained are real. Let
us replace $i$ by $-i$ in the expression $z = P + iQ$ for a complex
number $z$ ($P$ and $Q$ being real); we obtain the complex-conjugate
number $z^* = P - iQ$. If $z$ did not change here, that is, $z^* = z$,
then $Q = 0$ and $z$ is real. But we can see that (for real $\varphi$)
the right-hand sides of (14.3.4) do not change on substituting $-i$
for $i$, and consequently they are real.

Further, using expressions (14.3.4) for $\cos\varphi$ and
$\sin\varphi$ we can also determine trigonometric functions of an
imaginary argument via exponents. We put $\varphi = i\psi$ in
(14.3.4), assuming $\psi$ real, so that $\varphi$ is pure imaginary.
We get

$$\cos(i\psi) = \frac{e^{-\psi} + e^{\psi}}{2}, \qquad \sin(i\psi) = \frac{e^{-\psi} - e^{\psi}}{2i} = i\,\frac{e^{\psi} - e^{-\psi}}{2}. \tag{14.3.5}$$

Thus we are back again to the exponential functions of the real
argument $\psi$, or more accurately to certain combinations of
exponents we are going to discuss in the next section of this chapter.

In discussing the functions $\cos(i\psi)$ and $\sin(i\psi)$ we can
solve still another problem. For example, assume $\psi = 1$. Simple
calculations yield

$$\cos i = \frac{e^1 + e^{-1}}{2} \approx \frac{2.72 + 0.37}{2} \approx 1.5 \tag{14.3.6}$$

(and $\sin i \approx i\,\dfrac{2.72 - 0.37}{2} \approx 1.18i$). We pose
the question: What is the angle whose cosine equals 1.5? Before we
could have answered that there is no such angle since the cosine of
any angle does not exceed unity. Here it is tacitly implied that the
angle is a "real" angle, one which can be drawn or measured with an
instrument. But now that we have introduced imaginary angles we can
obtain a definite numerical solution: the equation

$$\cos\varphi = 1.5 \tag{14.3.7}$$

has the (approximate) solution $\varphi \approx i$. Obviously, the
value $\varphi \approx -i$ is also a solution of Eq. (14.3.7);
moreover, there are infinitely many solutions which differ by
multiples of $2\pi$. Thus, before we had no solutions of Eq. (14.3.7)
but now we can consider them to be infinitely many:

$$\varphi \approx \pm i + 2k\pi, \quad k = 0, \pm 1, \pm 2, \ldots. \tag{14.3.8}$$
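These statements about imaginary angles can be verified with the standard library. In the sketch below, which is an illustration under stated assumptions rather than part of the original text, the exact imaginary part of the solution is taken as $\psi = \operatorname{arccosh} 1.5$ (obtained from $\cos(i\psi) = (e^{\psi} + e^{-\psi})/2 = 1.5$), which is close to, but not exactly, 1.

```python
import cmath
import math

print(cmath.cos(1j))          # cos i: a real number slightly above 1.5
print(cmath.sin(1j))          # sin i: a pure imaginary number

# Exact solutions of cos(phi) = 1.5: phi = +-i*psi + 2*k*pi.
psi = math.acosh(1.5)
solutions = [complex(2 * math.pi * k, sign * psi)
             for k in (-1, 0, 1) for sign in (1, -1)]

for phi in solutions:
    print(phi, cmath.cos(phi))   # every value of the cosine is close to 1.5
```

The library function `cmath.acos(1.5)` returns one of these solutions directly.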

The situation here is quite similar to that we encountered in the case
of algebraic equations. When we used only real numbers, the quadratic
equation $x^2 - n = 0$, with $n > 0$, had two solutions (roots),
$x = \pm\sqrt{n}$, while the equation $x^2 + n = 0$ had no solution.
But when we used complex numbers, the second equation also had two
solutions, $x = \pm i\sqrt{n}$. Similarly, the equation
$\cos\varphi = n$, for $|n| \leq 1$, had infinitely many solutions
(for example, to the value $n = 1/2$ there correspond the angles
$\varphi = \pm\pi/3 + 2k\pi$, $k$ being any integer), while for
$|n| > 1$ the equation had no solution; but when $\varphi$ is complex,
the equation has infinitely many solutions.

The problem of solving equations can be formulated as that of
determining an inverse function. Let us consider a ("direct") function
$t = x^2$; the determination of its inverse reduces to solving the
equation $x^2 = t$, where $t$ is given and $x$ must be found. Here for
real variables the original function exists for all values of the
argument $x$, while the inverse function only exists for the argument
$t \geq 0$; in the complex plane the inverse function is also defined
for all $t$. Similarly, the function $\varphi = \arccos s$, the
inverse of $s = \cos\varphi$, also exists for real variables only in
the interval $-1 \leq s \leq 1$ (cf. Section 4.11), while for complex
numbers the function is defined for all $s$.

The situation proves to be the same with the logarithmic function
$z = \ln w$, which is defined as the inverse of the exponential
function $w = e^z$. We know that real numbers do not always have
logarithms; for example, negative numbers have no logarithms.
Logarithms of complex numbers, as we will see later, are always
defined. Note, finally, that all the three inverse functions,
$x = \sqrt{t}$, $\varphi = \arccos s$, and $z = \ln w$, are not
single-valued, the nature of this being different; the reasons for
this difference will be discussed later.

We rewrite formula (14.2.10) thus:

$$e^{x+iy} = \rho(\cos y + i\sin y) = \rho e^{iy}, \tag{14.3.9}$$

where $\rho = e^x$, that is, $x = \ln\rho$. Since the equation
$w = e^z$ can be written as $z = \ln w$, from (14.3.9) it follows that
if $w = \rho(\cos\varphi + i\sin\varphi)$, then

$$z = \ln w = \ln\rho + i\varphi = \ln|w| + i\operatorname{Arg} w. \tag{14.3.10}$$

This relation can, of course, be considered as the definition (unknown
to us before) of the number $\ln w$: we "derived" (14.3.10) from
(14.3.9) on the assertion that $e^z = w$ means the same as
$z = \ln w$, which is only valid for real $z$ and $w$ and not for
complex (or even for negative real) $w$ for which $\ln w$ had no
meaning before. The expediency of the definition (14.3.10) is
connected with the fact that for $\varphi = \operatorname{Arg} w = 0$,
that is, when $w$ is a real positive number, formula (14.3.10) leads
to the usual notion of logarithm (when introducing logarithms of
complex numbers we have just to refresh our knowledge and not to start
from the very beginning). Besides, by (14.3.9) here we always (for any
$w$) have $e^{\ln w} = w$.

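Expressing the numbers $\rho$ (that is, $|w|$) and $\varphi$ (that is,
$\operatorname{Arg} w$) via the real part and the coefficient of the
imaginary part of the complex number $w = u + iv$:
$\rho = |w| = \sqrt{u^2 + v^2}$,
$\varphi = \operatorname{Arg} w = \arctan(v/u)$, we write (14.3.10)
thus:

$$\ln(u + iv) = \ln\sqrt{u^2 + v^2} + i\arctan(v/u) \tag{14.3.11}$$

(the arctangent gives the argument directly only for $u > 0$; for
$u < 0$ the angle must be corrected by $\pi$, in accordance with the
quadrant in which the point $w$ lies).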

We have finally arrived at a new 
notion of the logarithm of a number, 
a notion wider than we learned at 
school. Formula (14.3.10) (or (14.3.11)) 
enables us to find, for example, the 
logarithm of a negative number. Indeed, 
we can write, say, 

$$-5 = 5 \times (-1) = 5(\cos\pi + i\sin\pi),$$

whence it follows that

$$\ln(-5) = \ln 5 + i\pi \approx 1.6 + 3.14i. \tag{14.3.12}$$

Since (cf. Section 4.9) $\log_{10} x = \ln x/\ln 10 \approx \ln x/2.3$, we have

$$\log_{10}(-5) = (1/2.3)\ln(-5) \approx 0.7 + 1.37i.$$

If the modulus $\rho$ of the complex number $z$ is unity, that is,
if $z = \cos\varphi + i\sin\varphi$, then $\ln z = i\varphi = i\,\mathrm{Arg}\,z$.
This relation between logarithms of complex numbers and their
arguments explains why the properties of the arguments of complex
numbers are close to those of logarithms: for any two (complex)
numbers $u$ and $v$ we have

$$\log(uv) = \log u + \log v, \quad \log\frac{u}{v} = \log u - \log v,$$

$$\mathrm{Arg}(uv) = \mathrm{Arg}\,u + \mathrm{Arg}\,v, \quad \mathrm{Arg}\frac{u}{v} = \mathrm{Arg}\,u - \mathrm{Arg}\,v,$$

that is, when complex numbers are multiplied their arguments (and
logarithms) are added, and when they are divided the arguments (and
logarithms) are subtracted. (Of course, here the symbol $\log u$ can
also mean the logarithm to any base, since all logarithms are
proportional to one another.)
Note that the angle $\varphi$, which is the argument of a complex
number and enters into the formulas, can be replaced by the angle
$\varphi + 2k\pi$, where $k$ is any integer. Therefore, the
logarithm of any number is not single-valued, but infinitely
many-valued. For example, the complete solution of (14.3.12) is

$$\ln(-5) = \ln 5 + i(\pi + 2k\pi), \quad k = 0, \pm 1, \pm 2, \ldots.$$

The same refers to the logarithm of any positive number. For example,

$$\ln 5 \approx 1.6 + i \times 2k\pi, \quad k = 0, \pm 1, \pm 2, \ldots,$$

14.4* Trigonometric Functions of 
a Purely Imaginary Independent 
Variable. Hyperbolic Functions 


Let us return to formulas (14.3.5) for trigonometric functions of
the imaginary argument:

$$\cos(i\psi) = \frac{e^{\psi} + e^{-\psi}}{2}, \quad \sin(i\psi) = i\,\frac{e^{\psi} - e^{-\psi}}{2}. \tag{14.4.1}$$

The combinations of the exponentials $e^{\psi}$ and $e^{-\psi}$
contained in the formulas have special names and notation. The
former is called the hyperbolic cosine and the latter the hyperbolic
sine of the argument $\psi$, and they are denoted by


or

$$\ln 1 = i \times 2k\pi, \quad k = 0, \pm 1, \pm 2, \ldots,$$

since

$$1 = \cos 0 + i\sin 0 = \cos 2\pi + i\sin 2\pi = \ldots = e^{0} = e^{2\pi i} = e^{4\pi i} = \ldots.$$

Later we will return to the question as 
to why the logarithm is not single- 
valued, why every number has not one, 
but infinitely many logarithms. 
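The many-valuedness can be seen directly in a short computation. This is a sketch only, assuming Python's standard `cmath` module; the helper name `log_values` and the range of $k$ shown are illustrative choices.

```python
import cmath
import math


def log_values(w: complex, ks=range(-2, 3)):
    """Return the values ln|w| + i*(Arg w + 2*pi*k) for the integers k in ks."""
    principal = cmath.log(w)  # the k = 0 value, with Arg w in (-pi, pi]
    return [principal + 2j * math.pi * k for k in ks]


values = log_values(-5)
for z in values:
    # every listed z satisfies e^z = -5 up to rounding error
    print(z, cmath.exp(z))
```

All the printed values of $z$ differ by multiples of $2\pi i$, yet each one is sent back to $-5$ by the exponential function.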

Note, finally, that logarithms enable us to raise any (complex)
number to any (complex) power. For example, let us find the number
$i^i$ (what will we obtain on multiplying the number $i$ by itself
$i$ times?). Obviously,

$$i = \cos\frac{\pi}{2} + i\sin\frac{\pi}{2} = e^{\pi i/2}$$

and, consequently,

$$i^i = (e^{\pi i/2})^i = e^{(\pi i/2)i} = e^{-\pi/2} \approx 2.7^{-1.57} = \frac{1}{2.7^{1.57}} \approx 0.21$$

(the number is even found to be real).
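The value of $i^i$ can be checked with a few lines of code. The sketch below assumes Python's standard `cmath` module and defines the power through the logarithm, $u^z = e^{z \ln u}$; the helper name `power_values` is hypothetical, and the extra values come from the other values of the logarithm, $\ln u + 2k\pi i$.

```python
import cmath
import math


def power_values(base: complex, exponent: complex, ks=range(-1, 2)):
    """Return exp(exponent * (ln base + 2*pi*i*k)) for each integer k in ks."""
    principal_log = cmath.log(base)
    return [cmath.exp(exponent * (principal_log + 2j * math.pi * k)) for k in ks]


below, principal, above = power_values(1j, 1j)
print(principal)     # about 0.2079, i.e. e^(-pi/2)
print(below, above)  # two more real values, e^(3*pi/2) and e^(-5*pi/2)
```

The value for $k = 0$ reproduces the number found in the text; the other two show that the many-valuedness of the logarithm carries over to powers.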
Exercises 


14.3.1. Let $u = \rho(\cos\varphi + i\sin\varphi)$ and
$z = x + iy$. Write all the values of $u^z$.

14.3.2. (A jocular problem.) Which is larger: $e^e$ or $i^i$? $e^i$ or $i^e$?


$$\cosh\psi = \frac{e^{\psi} + e^{-\psi}}{2}, \quad \sinh\psi = \frac{e^{\psi} - e^{-\psi}}{2}. \tag{14.4.2}$$

(The letter h reminds us of hyperbola.) Thus,

$$\cos(i\psi) = \cosh\psi, \quad \sin(i\psi) = i\sinh\psi. \tag{14.4.3}$$

Some properties of hyperbolic functions copy the properties of
(usual or circular) trigonometric functions, reproducing them
sometimes in a somewhat distorted form. For example, the function
$\cosh\psi$ is even: $\cosh(-\psi) = (e^{-\psi} + e^{\psi})/2 = \cosh\psi$,
while the function $\sinh\psi$ is odd:
$\sinh(-\psi) = (e^{-\psi} - e^{\psi})/2 = -\sinh\psi$. On the other
hand it is obvious that $\cosh(0) = 1$ and $\sinh(0) = 0$ (compare
the curves of the hyperbolic sine and cosine represented in
Figure 14.4.1). Figure 14.4.1 also presents the curve of the
hyperbolic tangent:

$$\tanh\psi = \frac{\sinh\psi}{\cosh\psi} = \frac{e^{\psi} - e^{-\psi}}{e^{\psi} + e^{-\psi}};$$

clearly, the function $\tanh\psi$ is odd ($\tanh(-\psi) = -\tanh\psi$)
and $\tanh(0) = 0$.$^{14.5}$ Note also that to the well-known formula
$\cos^2\varphi + \sin^2\varphi = 1$ there corresponds the following
formula of "hyperbolic trigonometry":

$$\cosh^2\psi - \sinh^2\psi = 1. \tag{14.4.4}$$

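Indeed,

$$\cosh^2\psi - \sinh^2\psi = \left(\frac{e^{\psi} + e^{-\psi}}{2}\right)^2 - \left(\frac{e^{\psi} - e^{-\psi}}{2}\right)^2 = \frac{e^{2\psi} + 2 + e^{-2\psi}}{4} - \frac{e^{2\psi} - 2 + e^{-2\psi}}{4} = \frac{4}{4} = 1.$$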


$^{14.5}$ We know that the (ordinary) cosine and sine (of a real
variable) are bounded: $|\cos\varphi| \le 1$, $|\sin\varphi| \le 1$,
while the tangent can assume arbitrarily large values. But the
hyperbolic cosine and sine are not bounded since, as is readily
seen, $\cosh\psi \to \infty$ as $\psi \to \infty$ and
$\sinh\psi \to \infty$ as $\psi \to \infty$, while $\tanh\psi$ is
bounded: $|\tanh\psi| < 1$ (and obviously
$\lim_{\psi\to\infty}\tanh\psi = 1$).

The expansions into series

$$e^{\psi} = 1 + \psi + \frac{\psi^2}{2!} + \frac{\psi^3}{3!} + \frac{\psi^4}{4!} + \ldots,$$

$$e^{-\psi} = 1 - \psi + \frac{\psi^2}{2!} - \frac{\psi^3}{3!} + \frac{\psi^4}{4!} - \ldots$$

and formulas (14.4.2) yield

$$\cosh\psi = 1 + \frac{\psi^2}{2!} + \frac{\psi^4}{4!} + \frac{\psi^6}{6!} + \ldots, \quad \sinh\psi = \psi + \frac{\psi^3}{3!} + \frac{\psi^5}{5!} + \ldots \tag{14.4.5}$$

(compare with the expansions (14.2.9) and (14.2.9a) of trigonometric
functions into series), whence it follows, in particular, that

$$\sinh\psi \approx \psi, \quad \cosh\psi \approx 1 + \frac{\psi^2}{2}, \quad \tanh\psi \approx \psi \tag{14.4.6}$$

for small $\psi$ (we have here neglected terms of the third and
higher orders of smallness with respect to $\psi$). Further, it is
obvious that


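$$(\sinh\psi)' = \left(\frac{e^{\psi} - e^{-\psi}}{2}\right)' = \frac{e^{\psi} + e^{-\psi}}{2} = \cosh\psi,$$

$$(\cosh\psi)' = \left(\frac{e^{\psi} + e^{-\psi}}{2}\right)' = \frac{e^{\psi} - e^{-\psi}}{2} = \sinh\psi, \tag{14.4.7}$$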


whence, with account taken of (14.4.4), we have

$$(\tanh\psi)' = \left(\frac{\sinh\psi}{\cosh\psi}\right)' = \frac{(\sinh\psi)'\cosh\psi - \sinh\psi\,(\cosh\psi)'}{\cosh^2\psi} = \frac{\cosh^2\psi - \sinh^2\psi}{\cosh^2\psi} = \frac{1}{\cosh^2\psi}$$

(compare with the formulas for derivatives of trigonometric
functions in Section 4.10).

Of no less surprise is the resemblance of the well-known ("school")
formulas of addition for trigonometric functions:

$$\sin(\alpha + \beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta,$$

$$\cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta,$$

$$\tan(\alpha + \beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta}$$

and similar formulas for hyperbolic functions

$$\sinh(u + v) = \sinh u\cosh v + \cosh u\sinh v,$$

$$\cosh(u + v) = \cosh u\cosh v + \sinh u\sinh v,$$

$$\tanh(u + v) = \frac{\sinh(u + v)}{\cosh(u + v)} = \frac{\sinh u\cosh v + \cosh u\sinh v}{\cosh u\cosh v + \sinh u\sinh v} = \frac{\sinh u/\cosh u + \sinh v/\cosh v}{1 + (\sinh u\sinh v)/(\cosh u\cosh v)} = \frac{\tanh u + \tanh v}{1 + \tanh u\tanh v} \tag{14.4.8}$$

(the last of these formulas was obtained by dividing the numerator
and denominator of the fraction $\sinh(u + v)/\cosh(u + v)$ by
$\cosh u\cosh v$; as to the first and the second formula see
Exercise 14.4.1). From Eq. (14.4.8) it follows that

$$\sinh 2u = 2\sinh u\cosh u, \quad \cosh 2u = \cosh^2 u + \sinh^2 u, \quad \tanh 2u = \frac{2\tanh u}{1 + \tanh^2 u} \tag{14.4.9}$$



Figure 14.4.2 


$\sinh(-\psi) = -\sinh\psi$, and $\tanh(-\psi) = -\tanh\psi$); the
equalities $\cosh(0) = 1$, $\sinh(0) = \tanh(0) = 0$; and some
others.$^{14.6}$


(we have obtained these formulas by setting $u = v$ in (14.4.8); all
formulas (14.4.4)-(14.4.9) resemble very much those you already
know, don't they?).
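Formulas (14.4.3), (14.4.4), (14.4.8) and (14.4.9) can be spot-checked numerically. The following is a minimal sketch, assuming Python's standard `math` and `cmath` modules; the helper name `identity_residuals` and the sample values of $u$ and $v$ are arbitrary choices. A numerical check at one point is of course not a proof.

```python
import cmath
import math


def identity_residuals(u: float, v: float) -> dict:
    """Left side minus right side for several formulas of Section 14.4."""
    sh, ch, th = math.sinh, math.cosh, math.tanh
    return {
        "(14.4.3) cos": cmath.cos(1j * u) - ch(u),
        "(14.4.3) sin": cmath.sin(1j * u) - 1j * sh(u),
        "(14.4.4)": ch(u) ** 2 - sh(u) ** 2 - 1,
        "(14.4.8) sinh": sh(u + v) - (sh(u) * ch(v) + ch(u) * sh(v)),
        "(14.4.8) cosh": ch(u + v) - (ch(u) * ch(v) + sh(u) * sh(v)),
        "(14.4.8) tanh": th(u + v) - (th(u) + th(v)) / (1 + th(u) * th(v)),
        "(14.4.9) cosh": ch(2 * u) - (ch(u) ** 2 + sh(u) ** 2),
    }


residuals = identity_residuals(0.7, -1.3)
for name, value in residuals.items():
    print(name, abs(value))  # each residual should be at rounding-error level
```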


The names of hyperbolic functions $\cosh\psi$ and $\sinh\psi$ are
connected with the following. Ordinary (circular) cosine and sine
can be defined as the abscissa and the ordinate of a point $M$ of a
unit circle $x^2 + y^2 = 1$ (Figure 14.4.2a); the argument here is
the angle $\angle AOM = \varphi$, which can be considered as a
doubled area of sector $AOM$ of the circle (our circle has the
radius 1). Similarly, the hyperbolic cosine and sine can be
described as being the abscissa and the ordinate of a variable point
$M$ of a unit hyperbola $x^2 - y^2 = 1$ (Figure 14.4.2b), where the
argument is the hyperbolic angle $\psi$, which is the doubled area
of sector $AOM$ of the hyperbola (see Exercise 14.4.5).
Figure 14.4.2 also presents geometrically the circular and
hyperbolic tangents:

$$\tan\varphi = \frac{\sin\varphi}{\cos\varphi} = \frac{MP}{OP} = \frac{AT}{OA} = AT \quad \text{(see Figure 14.4.2a)},$$

$$\tanh\psi = \frac{\sinh\psi}{\cosh\psi} = \frac{MP}{OP} = \frac{AT}{OA} = AT \quad \text{(see Figure 14.4.2b)}$$

(note that in both cases $OA = 1$); the segment $AT$ is called the
tangent line, and the segments $MP$ and $OP$ the sine and cosine
lines. Many properties of hyperbolic functions follow from the
foregoing: formula (14.4.4) (the relations
$\cos^2\varphi + \sin^2\varphi = 1$ and $\cosh^2\psi - \sinh^2\psi = 1$
are no other than the equations $x^2 + y^2 = 1$ and
$x^2 - y^2 = 1$ of the circle and the hyperbola); the even nature of
the function $\cosh\psi$ and the odd nature of the functions
$\sinh\psi$ and $\tanh\psi$ (from Figure 14.4.2b it is readily seen
that $\cosh(-\psi) = \cosh\psi$,


We can now look from a new angle at the relation (14.4.3) of
circular and hyperbolic functions. Previously, the sine and the
cosine of angle $\varphi$ were defined as the segments in a circle
of radius 1 (see Figure 14.4.2a) or as the ratios of the sides of a
right triangle with acute angle $\varphi$. But these definitions
tell us nothing of, say, what $\sin i$ or $\cos i$ is. What is the
strange angle of magnitude $i$? How can we turn through such an
angle? How can we construct a right triangle with such an angle? The
last questions, of course, remain without any answer, but the values
of $\sin i$ and $\cos i$ can quite possibly be found with the help
of Euler's formulas. Note that the series (14.2.9) for the cosine,
$\cos\varphi = 1 - \varphi^2/2! + \varphi^4/4! - \ldots$, contains
only even powers of angle $\varphi$; therefore, on substituting
$\varphi = i$ (or, generally, $\varphi = i\psi$, where $\psi$ is a
real number) all the terms of the series remain real: the cosine of
a pure imaginary angle is a real number. The series (14.2.9a) for
the sine, $\sin\varphi = \varphi - \varphi^3/3! + \ldots$, on
substituting $\varphi = i\psi$


$^{14.6}$ Note that formulas (14.4.8) for addition and the
definitions (14.4.2) for hyperbolic functions can also be derived
using the geometric representation of hyperbolic functions presented
in Figure 14.4.2b (see, for example, V. G. Shervatov, Hyperbolic
Functions, Heath, Boston, 1963).
yields a pure imaginary expression, and hence the coefficient $i$
appears in formula (14.4.3) relating $\sin(i\psi)$ and the real
function $\sinh\psi$.

The analogy between trigonometric and hyperbolic functions can also
be seen when we discuss an important topic on differential
equations. We know that the differential equation

$$\frac{d^2x}{dt^2} = -kx \tag{14.4.10}$$

for $k > 0$ admits of a general solution

$$x = A\cos\omega t + B\sin\omega t, \quad \omega = \sqrt{k} \tag{14.4.11}$$

(see Section 10.2). On the other hand, the equation

$$\frac{d^2x}{dt^2} = kx, \tag{14.4.10a}$$

where $k$ is also greater than zero, as we have already seen (see
Section 13.10), has particular solutions $x = e^{\sqrt{k}\,t}$ and
$x = e^{-\sqrt{k}\,t}$, so that its general solution can be written
in the form

$$x = Ce^{\sqrt{k}\,t} + De^{-\sqrt{k}\,t},$$

where the constants $C$ and $D$ are arbitrary. If we write this
solution as

$$x = (C + D)\,\frac{e^{\sqrt{k}\,t} + e^{-\sqrt{k}\,t}}{2} + (C - D)\,\frac{e^{\sqrt{k}\,t} - e^{-\sqrt{k}\,t}}{2}$$

and denote $C + D = A$, $C - D = B$, we, by virtue of definitions
(14.4.2), obtain a general solution of (14.4.10a), which is similar
to the general solution (14.4.11) of (14.4.10):

$$x = A\cosh\omega t + B\sinh\omega t, \quad \text{where } \omega = \sqrt{k}. \tag{14.4.11a}$$
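The passage from the exponential form of the solution to the hyperbolic form (14.4.11a) can be checked numerically. The sketch below uses only Python's standard `math` module; the values of $k$, $C$ and $D$ are hypothetical, and the second derivative is estimated by a central difference, so the residual of the equation is small but not exactly zero.

```python
import math


def second_derivative(f, t: float, h: float = 1e-3) -> float:
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2


k = 2.0            # hypothetical value of the constant k > 0
C, D = 1.5, -0.5   # hypothetical arbitrary constants
A, B = C + D, C - D
omega = math.sqrt(k)


def x_exp(t: float) -> float:
    return C * math.exp(omega * t) + D * math.exp(-omega * t)


def x_hyp(t: float) -> float:
    return A * math.cosh(omega * t) + B * math.sinh(omega * t)


for t in (0.0, 0.5, 1.0):
    # the two forms agree, and x'' - k*x is close to zero
    print(t, x_exp(t) - x_hyp(t), second_derivative(x_hyp, t) - k * x_hyp(t))
```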


The relation of trigonometric and exponential (or hyperbolic)
functions sheds new light on the initiation, in similar conditions,
of periodic motions (oscillations) and aperiodic motions. Let us
consider a ball lying on a circular top of a hill (Figure 14.4.3a).
On the very top the force of gravity is completely balanced by the
reaction of the base of the hill, that is, the component of the
force of gravity, which is tangent to the surface of the hill and
tends to move the ball, is zero. However, once the ball moves even
slightly, say, to the right, a force $F$ begins to act on it in the
direction which increases the motion, the force growing in
proportion to the increasing displacement $x$ (we can consider the
force to be proportional to the displacement: $F = kx$ for $k > 0$).
By Newton's second law (see Section 9.4) the equation of motion of
the ball is

$$m\frac{d^2x}{dt^2} = F = kx. \tag{14.4.12}$$

The solution to this equation (cf. Eq. (13.4.10a)) has the form

$$x = ae^{\sqrt{k/m}\,t} + be^{-\sqrt{k/m}\,t} \tag{14.4.13}$$

or, which is the same,

$$x = A\cosh\omega t + B\sinh\omega t, \quad \text{where } \omega = \sqrt{k/m}. \tag{14.4.14}$$

The existence of a solution that grows exponentially$^{14.7}$
($y_1 = e^{\omega t}$) indicates that the position of the ball on
the top of the hill is unstable. We can expect the initial
conditions to be generally such that in (14.4.13) $a \ne 0$

$^{14.7}$ Or the existence of the solutions $Y_1 = \cosh\omega t$,
$Y_2 = \sinh\omega t$ (growing exponentially as well). It is readily
understood that as $\psi$ grows the functions $u = \cosh\psi$ and
$v = \sinh\psi$ grow by the exponential law:

$$\lim_{\psi\to\infty}\frac{\cosh\psi}{e^{\psi}} = \lim_{\psi\to\infty}\frac{\sinh\psi}{e^{\psi}} = \frac{1}{2}.$$
and $b \ne 0$ (it is quite unlikely that the initial conditions
ensure the strict observation of the equality $a = 0$).
Consequently, in a certain period of time the first term on the
right-hand side of (14.4.13) will necessarily become rather large
and the ball will go down.

Now we consider another (quite op- 
posite in a certain sense) situation. Let 
the ball lie at the bottom of a well 
which has oval walls (Figure 14.4.3b). 
When the ball is displaced a distance x 
from the equilibrium position, a force F 
appears which is directed toward the 
bottom of the well, the “returning” 
force. (Such direction of the force in- 
dicates that the equilibrium is stable 
in the case of the well as distinct from 
the equilibrium of the ball on the hill.) 
Suppose that the force F is proportional 
to the displacement x in magnitude, so 
that F = — kx , where k is positive 
and the minus sign in the formula in- 
dicates that F tends to return the ball 
to its initial position. Then the equation of motion of the ball
will be

$$m\frac{d^2x}{dt^2} = -kx. \tag{14.4.12a}$$

Formally, this equation looks like the 
previous one (Eq. (14.4.12)). Therefore, 
we can use the solution of Eq. (14.4.12) 
by substituting — k for k, since this 
substitution transforms Eq. (14.4.12) 
into Eq. (14.4.12a). Here we obtain the 
solution to the new problem 

$$x = ae^{\sqrt{-k/m}\,t} + be^{-\sqrt{-k/m}\,t} = ae^{i\omega t} + be^{-i\omega t}, \quad \omega = \sqrt{k/m}, \tag{14.4.13a}$$

or, with the change of variables,

$$x = A\cos\omega t + B\sin\omega t \tag{14.4.14a}$$

(cf. Eq. (14.4.11); expressions (14.4.13a) and (14.4.14a) coincide
if we set $A = a + b$, $B = (a - b)i$). Here the (real) solution
(14.4.14a) of Eq. (14.4.12a) is bounded for all $t$, which is
closely connected with the steady state of the ball in the well.

Imaginary numbers (such as the square root of a negative number)
were first introduced (although in a rather formal manner) by the
prominent Italian Renaissance mathematician Girolamo Cardano
(1501-1576). It was by no means accidental that complex numbers
appeared when algebra was making such strides or that Cardano was
the man who introduced them. (In ancient times, geometry was the
center of attention; in the Renaissance, it was algebra that paved
the way for the rise of mathematical analysis in the 17th century.)
Cardano won himself a place of honor in the history of mathematics
chiefly for his pioneering publication of the formula for solving an
arbitrary$^{14.8}$ cubic equation:$^{14.9}$ if

$$x^3 + px + q = 0, \tag{14.4.15}$$

then

$$x = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}. \tag{14.4.16}$$

But the seemingly simple Cardano formula (14.4.16) is tricky. For
instance, if (14.4.15) has the form $x^3 - 3x = 0$, that is, if
$p = -3$ and $q = 0$, so that the equation has the simple roots
$x_1 = 0$, $x_2 = \sqrt{3} \approx 1.73$,
$x_3 = -\sqrt{3} \approx -1.73$, then (14.4.16) yields the
remarkable result
$x = \sqrt[3]{\sqrt{-1}} + \sqrt[3]{-\sqrt{-1}}$, that is, we
suddenly have square roots of $-1$.

The procedure by which square and cube roots can be extracted from
complex numbers was developed by one of Cardano's followers, Raphael
Bombelli (c. 1530-1572), who was able to explain how formula
(14.4.16) guarantees in all cases a correct way of finding the roots
(three roots, though not all necessarily distinct) of Eq. (14.4.15)
(see Exercise 14.4.4).

$^{14.8}$ Cardano did not discover it, however; he borrowed the
formula (14.4.16) from Niccolo Tartaglia (whose real name was
Niccolo Fontana) (c. 1500-1557) (see, e.g., M. Kline, Mathematical
Thought from Ancient to Modern Times, Oxford University Press, New
York, 1972, p. 263).

$^{14.9}$ On reducing the arbitrary cubic equation
$ax^3 + bx^2 + cx + d = 0$ to (14.4.15), see formulas (1.5.6) to
(1.5.6a).

The most influential French mathematician of the 16th century,
François Viète (1540-1603), showed that when $(q/2)^2 + (p/3)^3$ is
less than zero, that is, when the cube roots in Cardano's formula
(14.4.16) are extracted from complex numbers, it is possible to
reduce the right-hand side of (14.4.16) to a simple combination of
trigonometric functions. This paved the way to using trigonometric
functions in the theory of raising complex numbers to a power (even
negative and fractional powers), that is, to the general formula
(14.1.6). But in complete form, formula (14.1.6) was first expressed
by the French mathematician Abraham de Moivre (1667-1754). L. Euler,
however, gave it its modern form. (De Moivre fled from his native
France to England to escape the persecution of the Huguenots, and he
made a big contribution to the progress of mathematics in England.)
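The case in which Cardano's formula (14.4.16) passes through complex numbers can be followed in a short computation. This is only a sketch, assuming Python's standard `cmath` module; the function name `cardano_roots` is hypothetical, and the rule for pairing the two cube roots (their product must equal $-p/3$) is a standard fact about the formula that is not spelled out in the text above.

```python
import cmath
import math


def cardano_roots(p: float, q: float):
    """Three roots of x**3 + p*x + q = 0 from formula (14.4.16)."""
    disc = (q / 2) ** 2 + (p / 3) ** 3
    u_cubed = -q / 2 + cmath.sqrt(disc)
    if abs(u_cubed) < 1e-14:          # avoid dividing by u = 0 below
        u_cubed = -q / 2 - cmath.sqrt(disc)
    if abs(u_cubed) < 1e-14:          # p = q = 0: the triple root x = 0
        return [0j, 0j, 0j]
    r, phi = cmath.polar(u_cubed)
    roots = []
    for k in range(3):                # the three complex cube roots u
        u = cmath.rect(r ** (1 / 3), (phi + 2 * math.pi * k) / 3)
        v = -p / (3 * u)              # paired cube root, chosen so that u*v = -p/3
        roots.append(u + v)
    return roots


for x in cardano_roots(-3, 0):
    print(x)  # close to sqrt(3), -sqrt(3) and 0, with tiny imaginary parts
```

For $x^3 - 3x = 0$ the intermediate quantities are cube roots of $\sqrt{-1}$, yet the three sums come out real up to rounding error.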

The famous Leonhard Euler made wide use of complex numbers. He was
the one to introduce the notation $i$ for $\sqrt{-1}$, and he also
found most of the results of this chapter. Euler was convinced of
the validity of the fundamental theorem of algebra, which states
that every polynomial equation of degree $n$ (with real or
complex-valued coefficients) has exactly $n$ roots (in general,
complex-valued), and he made many attempts to prove it, but this
requires a full understanding of the algebraic and the geometric
meaning of complex numbers (see Figure 14.1.1), an understanding
that the mathematicians of those days did not yet have. Jean
d'Alembert, the second outstanding mathematician of the 18th
century, came closer to proving the fundamental theorem of algebra,
but he failed to carry it through to the end.

We owe the first proof (proofs, to be more exact, for there were
several) of the fundamental theorem of algebra to Carl Friedrich
Gauss (1777-1855). He also provided the modern idea of the plane of
the complex variable $z = x + iy$, thus freeing complex numbers from
the last traces of mystery, even of mysticism,$^{14.10}$ and
produced many other profound results in the field of "complex
algebra and analysis." For one, Gauss created a nontrivial
"arithmetic of whole complex numbers", numbers like $a + ib$, where
$a$ and $b$ are whole numbers, or integers. In this unusual
arithmetic, 3 is a prime but 5 is not, since
$5 = (1 + 2i)(1 - 2i)$. The theory of complex numbers can be said to
have become a full-fledged branch of mathematics only beginning with
Gauss. Before Gauss, a geometric interpretation of complex numbers
(see Figure 14.1.1) was presented by Caspar Wessel (1745-1818), a
self-taught Norwegian-born surveyor, and the Swiss Jean-Robert
Argand (1768-1822), who was also a self-educated mathematician and a
bookkeeper. However, the publications of these two practically
unknown scientists failed to attract any attention at the time, and
Gauss arrived at his ideas independently.

$^{14.10}$ Yet even the much simpler negative numbers were first
introduced only by Cardano and Viète, and even in Viète's time
mathematicians viewed them with great suspicion.
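The remark that 5 is not a prime among the whole complex numbers while 3 is can be illustrated with a small search. The sketch below rests on a standard fact that is assumed here rather than taken from the text: an ordinary prime $n$ splits as $(a + ib)(a - ib)$ exactly when $n = a^2 + b^2$ for some integers $a$ and $b$. The function name is hypothetical.

```python
import math


def two_square_decompositions(n: int):
    """All pairs 1 <= a <= b with a*a + b*b == n."""
    limit = math.isqrt(n)
    return [(a, b)
            for a in range(1, limit + 1)
            for b in range(a, limit + 1)
            if a * a + b * b == n]


print(two_square_decompositions(5))    # [(1, 2)]
print(two_square_decompositions(3))    # []
print(complex(1, 2) * complex(1, -2))  # (5+0j)
```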


Exercises 


14.4.1. Prove formulas (14.4.8) for adding hyperbolic functions.

14.4.2. Prove that

$$\sinh t = \frac{2\tanh(t/2)}{1 - \tanh^2(t/2)}, \quad \cosh t = \frac{1 + \tanh^2(t/2)}{1 - \tanh^2(t/2)}, \quad \text{and} \quad \tanh t = \frac{2\tanh(t/2)}{1 + \tanh^2(t/2)}.$$

14.4.3. What curves are given by the following parametric equations:

$$\text{(a)} \quad x = \frac{1 - t^2}{1 + t^2}, \quad y = \frac{2t}{1 + t^2}; \qquad \text{(b)} \quad x = \frac{1 + t^2}{1 - t^2}, \quad y = \frac{2t}{1 - t^2}?$$

What geometric meaning does parameter $t$ have in these formulas?

14.4.4. How does formula (14.4.16) lead to the values 0 and
$\pm\sqrt{3}$ of the roots of the cubic equation $x^3 - 3x = 0$
considered in the text?

14.4.5. Prove that $\cosh\psi = OP$ and $\sinh\psi = MP$ in the
notation of Figure 14.4.2b, where $\psi$ is twice the area of the
hatched hyperbolic sector.



Chapter 15 Functions the Physicist Needs 


15.1 Analytic Functions of a Real 
Variable 

The ideas of a function and of a functional relationship have
undergone much change through the history of mathematics. Different
approaches to the notion of a function in various periods sometimes
led to stormy discussions,$^{15.1}$ and at present are reflected in
the different ways in which a function can be defined.

The broadest possible definition of 
a function states that any relationship 
between the elements of two sets can be
called a function. If Mary is wearing 
a white dress, Ann red, and Dorothy 
blue, we can say that the color of the 
dress is a function of the name of the 
girl (or that the dress is a function of 
a certain person). Here the domain of 
definition of the function (or “input,” 
as mathematicians today say) consists 
of three names (or three girls) — Mary, 
Ann, and Dorothy— while the range of 
values of this function (or “output”) 
consists of three colors— white, red, 
and blue (or three dresses). Experi- 
mentally observed dependences of one 
type of physical quantities on another 
type, for instance, electrical resistance 
of a wire on the wire’s temperature, 
are also functions. Finally, one can write many formulas of the type
$y = ax^2 + bx + c$ and $y = e^{-x^2/2}$, and all these are
functions, too.

But is it advisable to gather all these 
different ideas under one name, func- 
tion? In some respects it is, since this 
stresses the enormous generality of the 
concept of function. This generality, 
however, is attained at a certain price. 

A function given purely verbally (color of dress as function of name
of girl) cannot be expressed by a formula, so one cannot find the
derivative or integral of such a function; as a result, all the
tools developed over the centuries by mathematicians prove to be
useless, since they cannot be applied to such functions.

$^{15.1}$ Best known of all is the controversy on this subject
(mentioned in Section 10.8) involving three outstanding
mathematicians of the 18th century: Leonhard Euler, Jean d'Alembert,
and Daniel Bernoulli. (Actually they never gave up their original
opinions.) The real question was: can a graph with a break in it or
a salient point be considered to represent a single function?

Functions obtained as a result of 
experiments in the form of lists of 
experimental data always prove to be 
given for a certain set of values of the 
independent variable within a certain 
accuracy.$^{15.2}$ Can the derivative or integral
of such a function be found? By
the very meaning of a derivative (or 
integral) such calculations involve find- 
ing the values of the function in a small 
neighborhood of a fixed value of the 
function and the independent variable. 
But it is impossible to obtain such 
values from a table or experiments 
directly, especially if one takes into 
account the errors of measurements. 
One must assume that there exists a 
smooth functional relationship between 
the quantities that are measured, since 
only then can we use the tools of higher 
mathematics; the experimental data 
or tables are needed to construct such 
a relationship. 

Of tremendous importance is the fact 
that a fundamental theory always leads 
to quite natural mathematical formulas. 
This is also true when we are forced to reverse certain
relationships (say, solve algebraic equations) and arrive at
solutions that are discontinuous, for instance with a gap between
the roots of an equation.
A physical theory is always constructed 
in such a way that it preserves the 
possibility of using that precious tech- 
nique for studying phenomena, differ- 
ential and integral calculus. 

$^{15.2}$ There are inevitable errors when the graph of a function
is the only source of information about the function (say, plotted
by a self-recorder).

Thus, functions given by formulas possess enormous advantages. It is
clear, firstly, that specifying a formula enables calculating a
function to any accuracy and for any value of the independent
variable. In contrast to the case where the function is given by a
table or lists of experimental data, repeated calculations by a
formula always yield the same result, and the accuracy can always be
preassigned. Another convenient feature of a formula is that
calculating the values of the function for different values of the
independent variable is reduced to a sequence of similar,
single-type, operations; this feature is especially important when
computers are employed. A formula also enables calculating, to any
accuracy, the difference of two values of the function at different
(but closely lying) values of the independent variable, which makes
possible the numerical calculation of the derivative. Adding the
values of a function at adjacent points, we can calculate the
integral of this function, and thus both the derivative and the
integral can be calculated to any degree of accuracy. Today, in the
computer age, such direct computation of derivatives and integrals
is quite simple.
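The direct computation of a derivative and an integral from a formula can be sketched in a few lines. The example below assumes only Python's standard `math` module and uses the formula $y = e^{-x^2/2}$ mentioned earlier in this section; the step `h`, the number of subintervals `n`, and the function names are illustrative choices, and floating-point rounding limits the accuracy actually reached.

```python
import math


def f(x: float) -> float:
    """The formula y = exp(-x**2 / 2) mentioned earlier in the section."""
    return math.exp(-x * x / 2)


def derivative(func, x: float, h: float = 1e-5) -> float:
    """Difference of two closely lying values divided by their spacing."""
    return (func(x + h) - func(x - h)) / (2 * h)


def integral(func, a: float, b: float, n: int = 1000) -> float:
    """Sum of func at n midpoints of [a, b], times the step."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))


print(derivative(f, 1.0))     # close to -exp(-0.5)
print(integral(f, 0.0, 1.0))  # close to sqrt(pi/2) * erf(1/sqrt(2))
```

Both routines consist of repeated single-type operations on values of the formula, which is the feature stressed above.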

However, even among functions for which one can state an exact
method for calculating $f(x)$ from the value of $x$ there are
monstrosities. Take the following function: $f(x) = 1$ if $x$ is an
irreducible fraction with an even denominator and $f(x) = 0$ for all
the other values of $x$:$^{15.3}$

$$f(x) = \begin{cases} 1 & \text{if } x = m/n, \text{ where } m \text{ and } n \text{ are relatively prime and } n \text{ is even}, \\ 0 & \text{otherwise}. \end{cases}$$

$^{15.3}$ Even such an exotic function (but still mathematically
correct) can be expressed by a formula, albeit a rather complicated
one.

Just try to construct the graph of such a function, and you will see
that this is impossible. You would have to build a dense fence of
vertical segments of unit length, each erected at a point of the $x$
axis that corresponds to a number of the form $m/n = m/2n_1$, where
$m$ and $n_1$ are relatively prime positive integers and $m$ is odd;
there is an infinitude of such numbers on any given finite segment
of the $x$ axis. The techniques developed in this book are not
suitable, of course, for studying such functions.
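The rule defining the "fence" function is easy to state in code for rational arguments, even though its graph cannot be drawn. The sketch below assumes Python's standard `fractions` module, whose `Fraction` type stores every fraction in lowest terms; irrational arguments, for which the function is 0, cannot be represented this way, and the bound 8 on the denominators is an arbitrary choice.

```python
from fractions import Fraction


def fence(x: Fraction) -> int:
    """1 if x in lowest terms has an even denominator, else 0."""
    return 1 if x.denominator % 2 == 0 else 0


print(fence(Fraction(1, 2)), fence(Fraction(2, 4)), fence(Fraction(1, 3)))

# rationals in [0, 1] with denominator at most 8 on which the function is 1
posts = sorted({Fraction(m, n) for n in range(1, 9) for m in range(n + 1)
                if fence(Fraction(m, n))})
print(len(posts), posts[:5])
```

Raising the bound on the denominators adds more and more "posts" between the existing ones, which is the density described in the text.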

On the other hand, we have, say, the 
“ideal” function y = 1 — x, whose 
graph is a smooth (even straight) line 
and which at each point has a tangent 
line (which coincides with the graph 
of the function itself), that is, the 
function has a derivative everywhere. 
The inverse of this function can also 
be easily found (it coincides with the 
initial function); the same is true of the function $f(f(x))$ (which
is simply $F(x) = x$); it is easy to find the integral of this
function:

$$\int_a^b f(x)\,dx = \left(x - \frac{x^2}{2}\right)\Big|_a^b = -(b - a)\left(\frac{a + b}{2} - 1\right);$$


and so on. 

Intermediate cases are also possible, 
that is, functions that are not as “bad” 
(or, as mathematicians say, pathologi- 
cal) as the “fence” we described above 
and yet not as “good” as the function 
we have just described. So how is one 
to classify the “good” and “bad” func- 
tions? What functions are considered 
“good”? One approach that clarifies this 
matter considerably is the transition 
to complex numbers and (complex- 
valued) functions of a complex vari- 
able— but first we will try to answer 
these questions while remaining in the 
real domain for the time being. 

One of the main results of Chapter 6 is the possibility of
representing functions in the form of series:

$$f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \ldots \tag{15.1.1}$$

(Maclaurin's series) or

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \ldots \tag{15.1.2}$$

(Taylor's series, which generalizes Maclaurin's series). These
series make it possible, if we know the value of a function at one
point and the derivatives at the same point, to reconstruct the
function (with as high an accuracy as desired) in the vicinity of
the point and continue it to another point. If, say, formula
(15.1.1) holds true, then, differentiating it term by term, we find
the series expansion for the derivative

$$f'(x) = f'(0) + \frac{f''(0)}{1!}x + \frac{f'''(0)}{2!}x^2 + \ldots \tag{15.1.3}$$

and, if required, for higher-order derivatives. Similarly,
integrating (15.1.1), we can write

$$\int f(x)\,dx = C + f(0)x + \frac{f'(0)}{2!}x^2 + \frac{f''(0)}{3!}x^3 + \frac{f'''(0)}{4!}x^4 + \ldots. \tag{15.1.4}$$

It is to functions of this type that the techniques developed in
this book can be applied most successfully. Functions that can be
represented in the form (15.1.1) or (15.1.2) are known as analytic
functions.$^{15.4}$

$^{15.4}$ Of course, formulas (15.1.1) and (15.1.2) can be applied
usually only in a limited range of variation of $x$ (that is, in a
limited vicinity of point $x = 0$ or $x = a$), a situation discussed
in detail in Section 6.3. In accordance with this, only the concept
of a function analytic at a point (i.e. such that it can be expanded
in a power series (15.1.1) or (15.1.2) in the vicinity of the point
under consideration) has a rigorous meaning; a function that is
analytic on an interval is analytic at each point of that interval.
Here we will not dwell on the possibility of term-by-term
differentiation and integration of formulas (15.1.1) and (15.1.2);
this possibility usually materializes for the majority of functions
that a scientist or engineer needs.

It is clear that an integral rational function (a polynomial)

$$f(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n \tag{15.1.5}$$

is analytic: here $f(0) = a_0$, $f'(0) = a_1$, $f''(0) = 2!\,a_2$,
$f'''(0) = 3!\,a_3$, ..., $f^{(n)}(0) = n!\,a_n$, and
$f^{(m)}(0) = 0$ for $m > n$ (thus, representation (15.1.1) of such
a function coincides with (15.1.5)). We can say that analytic
functions, such as $e^x = 1 + x + x^2/2! + x^3/3! + \ldots$ or
$\cos x = 1 - x^2/2! + x^4/4! - \ldots$ or
$\ln(1 + x) = x - x^2/2 + x^3/3 - \ldots$ (see Chapter 6), are the
most natural generalizations of polynomials: a general analytic
function (15.1.1) or (15.1.2) is, so to say, a polynomial of
infinite degree.
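How a function is rebuilt from its derivatives at one point, as in Maclaurin's series (15.1.1), can be tried on $e^x$ and $\cos x$. The following is a minimal sketch assuming Python's standard `math` module; the helper name `maclaurin_sum` and the number of terms are arbitrary choices, and the derivatives at zero are supplied by hand rather than computed.

```python
import math


def maclaurin_sum(derivatives_at_zero, x: float) -> float:
    """Partial sum of (15.1.1): sum of f^(n)(0) / n! * x**n over the given list."""
    return sum(d / math.factorial(n) * x**n
               for n, d in enumerate(derivatives_at_zero))


terms = 20
# every derivative of e^x at 0 equals 1
exp_derivs = [1.0] * terms
# the derivatives of cos x at 0 repeat with period four: 1, 0, -1, 0
cos_derivs = [(1.0, 0.0, -1.0, 0.0)[n % 4] for n in range(terms)]

x = 1.0
print(maclaurin_sum(exp_derivs, x), math.exp(x))
print(maclaurin_sum(cos_derivs, x), math.cos(x))
```

With a finite list of derivatives the sum is simply a polynomial, in line with the view of an analytic function as a polynomial of infinite degree.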


Of course, a function that is analytic at a point x = a is “infinitely
smooth” at this point, that is, has derivatives of all orders there,
since the derivatives appear in representation (15.1.2), whose
existence is equivalent to the function being analytic. However, the
existence of all the derivatives of a function does not by itself imply
that the function is analytic; in other words, analyticity is a
stronger requirement than the condition that the function have
derivatives of all orders. Indeed, the fact that a function has all the
derivatives implies only that we can write the series on the right-hand
side of (15.1.2), while for the function f(x) to be analytic it must be
represented by the series, which means that the series must converge
(the sum exists) and its sum must equal f(x) for all x close to a.
Take, for example, the function $y = e^{-1/x^2}$ (Figure 15.1.1). At
x = 0 the expression $e^{-1/x^2}$ has no meaning, but since for small
(in absolute value) x the number $1/x^2$ is extremely large and, hence,
$e^{-1/x^2}$ is very small, it is natural to assume that y(0) = 0. The
reader can easily see that all the derivatives also exist in this case:
$y^{(n)}(x) = d^n y/dx^n$, with n = 1, 2, 3, ..., which can easily be
found by the formulas of Chapter 4 (say,
$dy/dx = e^{-1/x^2}\,[d(-1/x^2)/dx] = (2/x^3)\,e^{-1/x^2}$), but are
all zero at x = 0 (see Exercise 15.1.1), which means that the
right-hand side of Maclaurin’s series (15.1.1) representing our
function is identically zero, while the function proper does not
vanish, of course, at $x \neq 0$. Fortunately, functions like
$e^{-1/x^2}$ are very rare and are almost never encountered in physical
and technical applications.[15.5]
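The flatness of this function at the origin can be illustrated numerically. The following is only a minimal sketch; the helper names `f` and `flatness_ratio` are our own and do not come from the text. If every derivative vanishes at x = 0, the ratio f(x)/x^n must shrink to zero as x tends to 0 for every fixed n, even though f(x) itself is positive for every nonzero x.

```python
import math

def f(x):
    """The function y = exp(-1/x^2), extended by y(0) = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

def flatness_ratio(x, n):
    """f(x) / x^n, which tends to 0 as x -> 0 for every fixed n."""
    return f(x) / x**n

if __name__ == "__main__":
    # The ratio collapses as x shrinks, although f(x) > 0 throughout.
    for x in (0.5, 0.2, 0.1):
        print(x, f(x), flatness_ratio(x, 10))
```

Since no power of x can keep pace with the function near the origin, every Maclaurin coefficient is zero, and the series cannot reproduce the nonzero values of the function.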

The requirement that a function be analytic is really very strong. For
instance, an analytic function (within the range of applicability of
formula (15.1.1) or (15.1.2)) may be reconstructed from its values on
an arbitrarily small segment (arc) of the graph of the function, since
knowing f(a) and the values of f(x) for values of x that are
arbitrarily close to a is quite sufficient for calculating all the
derivatives of the function at point a. Therefore, analytic functions
stand rather far from the definition of a function as an arbitrary
correspondence between the values of x and the values of f(x). However,
almost all functions given by simple formulas are analytic; functions
for which the analyticity condition is violated at separate points
(like the function $e^{-1/x^2}$ considered above) can be considered as
pathological.

Of course, in physical problems we 
may also encounter functions that are 
not analytic at some points (say, func- 
tions that have discontinuities or salient 
points), and Chapter 16 is devoted to 
even more remarkable functions (it is 
questionable whether the name “func- 
tion” can even be applied). In the major- 
ity of cases, however, the functions 
that a scientist or engineer needs are 
“good,” or analytic; this statement is 
especially evident when we go over to 
functions of a complex variable (see 
below). 

Exercise 

15.1.1. Suppose $f(x) = e^{-1/x^2}$ for $x \neq 0$ and f(0) = 0. What
is the derivative of such a function at x = 0? The second derivative
f''(0)? The nth derivative $f^{(n)}(0)$?

15.5 The reasons for such strange behavior
of this function are clarified if we go into the 
complex domain, that is, if we continue the 
function in such a manner that x can take on 
any complex values (in this connection see 
Section 15.2). 


15.2 The Derivative of a Function 
of a Complex Variable 

Let us take a (complex-valued) function w of a complex variable z, that
is, w = f(z), where w = u + iv and z = x + iy, with the numbers x and y
and the functions u = u(x, y) and v = v(x, y) being real-valued.

If a function f(z) can be expressed by a formula, then its derivative

$$\frac{df}{dz} = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z}$$

can be found by the general rules of Chapter 4 for (real-valued)
functions of a real variable. Here are some examples: if $w = z^2$,
then $dw/dz = 2z$; if $w = e^{kz}$, then $dw/dz = ke^{kz}$ (here k may
be complex-valued); if $w = \ln z$, then $dw/dz = 1/z$; and so on.
There are numerous examples of this type; for instance, we can form
various combinations, such as sums (say, $z^2 + e^z$), products (say,
$z^2 e^z$), and functions of functions (say, $e^{z^2}$), with
$d(z^2 + e^z)/dz = 2z + e^z$, $d(z^2 e^z)/dz = 2ze^z + z^2 e^z$, and
$de^{z^2}/dz = 2ze^{z^2}$. It is quite natural, therefore, that all the
rules referring to functions of a real variable remain valid when we go
over to functions of a complex variable, since these rules are based on
the general laws of algebra, say, the distributive property
$(a + b)c = ac + bc$, which shows how to remove parentheses in
calculations, and similar properties. These laws, or properties, do not
depend on the nature of the quantities $z$, $\Delta z$, $w$,
$\Delta w$. The rules for finding the derivative of $f(z) = z^2$ or of
the product of two functions, $w_1 w_2$, are based on the following
relationships:

$$(z + \Delta z)^2 = z^2 + 2z\,\Delta z + (\Delta z)^2,$$

$$\Delta(z^2) = (z + \Delta z)^2 - z^2 = 2z\,\Delta z + (\Delta z)^2,$$

and, respectively,

$$(w_1 + \Delta w_1)(w_2 + \Delta w_2) = w_1 w_2 + w_1\,\Delta w_2 + w_2\,\Delta w_1 + \Delta w_1\,\Delta w_2,$$

$$\Delta(w_1 w_2) = (w_1 + \Delta w_1)(w_2 + \Delta w_2) - w_1 w_2 = w_1\,\Delta w_2 + w_2\,\Delta w_1 + \Delta w_1\,\Delta w_2.$$

The raising to an imaginary or complex-valued power was defined by us
in such a way that the (approximate) equality
$e^{\Delta z} \approx 1 + \Delta z$, where we have discarded terms
whose order of smallness is greater than that of $\Delta z$, remains
valid (see Section 14.2), that is, we employed a formula that lies at
the base of calculations of derivatives of the exponential function of
a real variable, $e^x$. We can therefore be sure that all the rules for
finding derivatives are applicable to functions of a complex variable:
once the definition guarantees that $e^{\Delta z} \approx 1 + \Delta z$,
all subsequent reasoning is on a stable foundation.

Figure 15.2.1

Note that the variation of the independent variable z of the function
f(z) can occur in different ways: we can vary only the real part of z,
that is, $d_1 z = d_1 x$ and $d_1 y = 0$, or only the imaginary part of
z, that is, $d_2 z = i\,d_2 y$ and $d_2 x = 0$, or the two parts
simultaneously (Figure 15.2.1). The function w will change differently
depending on the way we change z, but if dz is small, the ratio dw/dz,
or the derivative, will remain the same in all cases.
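This independence of direction can be checked numerically. The sketch below is only an illustration under our own assumptions: the sample function $w = z^2 e^z$, the point `z0`, the step `h`, and the helper name `difference_quotient` are not taken from the text. It forms the ratio of increments for a real, an imaginary, and a diagonal increment dz and compares each with the value given by the product rule.

```python
import cmath

def w(z):
    """Sample function given by a formula: w = z^2 * e^z."""
    return z * z * cmath.exp(z)

def difference_quotient(func, z, direction, h=1e-6):
    """[func(z + dz) - func(z)] / dz with dz = h * direction."""
    dz = h * direction
    return (func(z + dz) - func(z)) / dz

z0 = 0.3 + 0.7j
# Product rule from the text: d(z^2 e^z)/dz = 2z e^z + z^2 e^z.
exact = (2 * z0 + z0 * z0) * cmath.exp(z0)

# Increments along the real axis, the imaginary axis, and a diagonal.
directions = [1, 1j, cmath.exp(0.25j * cmath.pi)]
quotients = [difference_quotient(w, z0, d) for d in directions]

if __name__ == "__main__":
    for d, q in zip(directions, quotients):
        print(d, q, abs(q - exact))
```

All three quotients agree with the product-rule value to within the accuracy of the finite step, which is what the statement about a small dz leads us to expect.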

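Let us verify this using the function $w = z^2$ as an example. We have

$$w = (x + iy)^2 = x^2 - y^2 + 2ixy = u + iv, \qquad (15.2.1)$$

where $u = x^2 - y^2$ and $v = 2xy$; here

$$\frac{\partial u}{\partial x} = 2x, \quad \frac{\partial u}{\partial y} = -2y, \quad \frac{\partial v}{\partial x} = 2y, \quad \frac{\partial v}{\partial y} = 2x. \qquad (15.2.2)$$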


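Suppose dy = 0, that is, dz = dx. Then

$$dw = w(z + dx) - w(z) = [u(x + dx, y) + iv(x + dx, y)] - [u(x, y) + iv(x, y)] = \frac{\partial u}{\partial x}\,dx + i\,\frac{\partial v}{\partial x}\,dx.$$

Therefore, the derivative in this case is equal to

$$w'(z) = \frac{dw}{dz} = \frac{dw}{dx} = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} = 2x + i2y. \qquad (15.2.3)$$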


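Now assume that dx = 0 and dz = i dy. Then w + dw = w[x + (y + dy)i]
and, hence,

$$dw = \frac{\partial u}{\partial y}\,dy + i\,\frac{\partial v}{\partial y}\,dy,$$

$$\frac{dw}{dz} = \frac{dw}{i\,dy} = -i\,\frac{dw}{dy} = -i\,\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} = -i(-2y) + 2x = 2x + i2y. \qquad (15.2.4)$$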

In transformations (15.2.3) and (15.2.4) we took into account formulas
(15.2.1) and (15.2.2) only in the last stages, while all the other
reasoning retains its value for all functions, that is, if the
derivative of the function w = w(z) = u(x, y) + iv(x, y) does not
depend on the choice of the increment dz of the independent variable,
we have

$$\frac{dw}{dz} = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} = -i\,\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}. \qquad (15.2.5)$$

Since both u and v are real-valued functions, the above relation yields
two formulas, the famous Cauchy-Riemann equations:

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \qquad (15.2.6)$$


These equations are satisfied automatically if w is given by a formula.
Here are some examples:

(1) If $w = z^2 = (x + iy)^2$, then $u = x^2 - y^2$, $v = 2xy$,
$\partial u/\partial x = \partial v/\partial y = 2x$, and
$\partial v/\partial x = -\partial u/\partial y = 2y$ (compare with
(15.2.2)).

(2) If $w = (2 + 3i)z^2 = (2 + 3i)[(x^2 - y^2) + i2xy] =
[2(x^2 - y^2) - 6xy] + i[3(x^2 - y^2) + 4xy]$, then
$\partial u/\partial x = \partial v/\partial y = 4x - 6y$ and
$\partial u/\partial y = -\partial v/\partial x = -6x - 4y$.

(3) If $w = e^z = e^x \cos y + ie^x \sin y$, that is, $u = e^x \cos y$
and $v = e^x \sin y$, then
$\partial u/\partial x = \partial v/\partial y = e^x \cos y$ and
$\partial u/\partial y = -\partial v/\partial x = -e^x \sin y$.

Note that the Cauchy-Riemann equations not only connect u(x, y) and
v(x, y) but also impose certain conditions on each of these
(real-valued) functions. Indeed, in view of (15.2.6),

$$\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial v}{\partial y}\right), \quad \text{i.e.} \quad \frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial x\,\partial y},$$

$$\frac{\partial}{\partial y}\left(\frac{\partial v}{\partial x}\right) = -\frac{\partial}{\partial y}\left(\frac{\partial u}{\partial y}\right), \quad \text{i.e.} \quad \frac{\partial^2 v}{\partial x\,\partial y} = -\frac{\partial^2 u}{\partial y^2}.$$

From this we get
$\partial^2 u/\partial x^2 = -\partial^2 u/\partial y^2$, and,
similarly, $\partial^2 v/\partial x^2 = -\partial^2 v/\partial y^2$;
in other words,

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \qquad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0. \qquad (15.2.7)$$

Any function $\varphi = \varphi(x, y)$ of two (real-valued) variables
x and y that satisfies
$\partial^2 \varphi/\partial x^2 + \partial^2 \varphi/\partial y^2 = 0$
is known as harmonic (see footnote 9.5). Thus, if
w = f(z) = u(x, y) + iv(x, y) is a differentiable function of the
complex variable z = x + iy, the real and imaginary parts of w (i.e.
functions u(x, y) and v(x, y), respectively) are harmonic functions.

At this point it is advisable to return to the question of how
functions are defined or classified, that is, the question posed at the
beginning of the chapter. If we are speaking of functions of a complex
variable, the most general approach would be as follows. The complex
variable z consists of two parts, the real numbers x and y or, which is
the same, it corresponds to a point in the plane (with coordinates x
and y). The function w = f(z) also consists of the real part u and the
imaginary part v, whereby to each value of this function there
corresponds a point in another plane (with coordinates u and v). To
each z, that is, to each pair (x, y), or to each point in the xy-plane,
we assign a pair of values of u and v, that is, a specific value of w
(which can also be understood as a point in the w-plane, and it is best
to distinguish between the two planes so as to avoid confusion, the
complex z plane and the complex w plane). If we assume that u(x, y) and
v(x, y) are smooth functions, we can even obtain the (partial)
derivatives $\partial u/\partial x$, $\partial v/\partial x$,
$\partial u/\partial y$, $\partial v/\partial y$,
$\partial w/\partial x$, and $\partial w/\partial y$. However, even
then we cannot be sure, generally speaking, that the derivative

$$\frac{dw}{dz} = \frac{du + i\,dv}{dx + i\,dy}$$

has a definite value, that is, a value independent of the choice of dx
and dy. Thus, the general correspondence under which to each point
z = x + iy there corresponds (or is assigned) another point
w = u(x, y) + iv(x, y) cannot be used to determine dw/dz.

If we write a formula that expresses w as a function of z, such as
$z^2$, $e^z$, or $z^3 e^{z^2}$, the real and imaginary parts of w, that
is, the functions u(x, y) and v(x, y), prove to be interconnected via
the Cauchy-Riemann equations (15.2.6) and each obeys the harmonicity
conditions (15.2.7). Thus, a formula that connects z and w restricts
considerably the choice of the functions u(x, y) and v(x, y), but in
return it enables finding the derivative dw/dz; moreover, the fact that
w and z are related through an analytic function results in other
important and neat corollaries. In this case one usually says w(z) is
an analytic function of complex variable z. All simple functions, such
as $z^2$ and $e^z$, are analytic.
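The Cauchy-Riemann equations (15.2.6) and the harmonicity condition (15.2.7) lend themselves to a numerical spot check. The following is a rough sketch under our own assumptions: the helper names (`partials`, `cauchy_riemann_residuals`, `laplacian_of_real_part`), the step sizes, and the test point are hypothetical choices, and finite differences only approximate the partial derivatives. The three sample functions are those of examples (1) to (3).

```python
import cmath

def partials(func, x, y, h=1e-5):
    """Central-difference estimates of u_x, u_y, v_x, v_y for func(x + iy) = u + iv."""
    wx = (func(complex(x + h, y)) - func(complex(x - h, y))) / (2 * h)
    wy = (func(complex(x, y + h)) - func(complex(x, y - h))) / (2 * h)
    return wx.real, wy.real, wx.imag, wy.imag

def cauchy_riemann_residuals(func, x, y):
    """Return (u_x - v_y, v_x + u_y); both are close to zero when (15.2.6) holds."""
    ux, uy, vx, vy = partials(func, x, y)
    return ux - vy, vx + uy

def laplacian_of_real_part(func, x, y, h=1e-3):
    """Five-point estimate of u_xx + u_yy, which (15.2.7) requires to vanish."""
    def u(a, b):
        return func(complex(a, b)).real
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4 * u(x, y)) / (h * h)

# Examples (1), (2), (3) from the text.
samples = {
    "z^2": lambda z: z * z,
    "(2+3i) z^2": lambda z: (2 + 3j) * z * z,
    "e^z": cmath.exp,
}

if __name__ == "__main__":
    for name, func in samples.items():
        print(name, cauchy_riemann_residuals(func, 0.4, -0.9),
              laplacian_of_real_part(func, 0.4, -0.9))
```

For each of the three functions both residuals and the estimated Laplacian come out at the level of the discretization error, in agreement with (15.2.6) and (15.2.7).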

One must understand that analytic functions of a complex variable
emerge when we specify the dependence between w and z by a formula;
this does not mean, however, that all functions (that is to say, all
maps of the complex z plane into the complex w plane) are necessarily
analytic. For instance, the function w = -z, which maps each point in
the complex z plane into a point w(z) in the complex w plane that is
symmetric to z about the origin (Figure 15.2.2a), is analytic; here
u = -x and v = -y, and, of course,

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0$$

(since in this case
$\partial^2 u/\partial x^2 = \partial^2 u/\partial y^2 =
\partial^2 v/\partial x^2 = \partial^2 v/\partial y^2 = 0$), and

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$

On the other hand, if we take the function w = f(z) = z*, which maps
point z of the complex z plane into point w(z) symmetric to z about the
real axis (Figure 15.2.2b), we see that this is not an analytic
function. Indeed, here u = x and v = -y, and although, of course,

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0,$$

we have

$$\frac{\partial u}{\partial x} = 1 \neq \frac{\partial v}{\partial y} = -1, \quad \text{but} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} = 0$$

(for a function to be nonanalytic, at least one of the Cauchy-Riemann
equations must not be valid). Similarly, the function
$w = |z|^2 = zz^*$ is nonanalytic, too. (But is there an algebraic
formula that can express the dependence of $|z|^2$ on z without
involving z*? Of course not!) The same is true of the function
$w = f(z) = |z| = \sqrt{zz^*}$. Examples of such functions abound. Take
the function $w = |z|^2 = x^2 + y^2$. Obviously, $u = x^2 + y^2$ and
v = 0, or $\partial u/\partial x = 2x$, $\partial u/\partial y = 2y$,
and $\partial v/\partial x = \partial v/\partial y = 0$, which implies
that the Cauchy-Riemann equations are not satisfied in this case
(except at the single point z = 0). In general, there is not a single
function of a complex variable involving z* that is analytic; not one
of such functions can be written in the form of an algebraic (or
analytical) formula.[15.6] The same is true of functions of variable z
that involve |z| or (the more so) Arg z.

Figure 15.2.2
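The failure of analyticity for z* and |z|^2 shows up directly in the ratio of increments, which now depends on the direction of dz. The sketch below is only an illustration; the helper names, the point `z0`, and the step `h` are our own hypothetical choices.

```python
def difference_quotient(func, z, direction, h=1e-6):
    """[func(z + dz) - func(z)] / dz with dz = h * direction."""
    dz = h * direction
    return (func(z + dz) - func(z)) / dz

def conjugate(z):
    """w = z*, the reflection of z about the real axis."""
    return z.conjugate()

def abs_squared(z):
    """w = |z|^2 = z z*."""
    return z * z.conjugate()

z0 = 1.0 + 2.0j

# Increment along the real axis versus along the imaginary axis.
conj_along_real = difference_quotient(conjugate, z0, 1)
conj_along_imag = difference_quotient(conjugate, z0, 1j)
abs2_along_real = difference_quotient(abs_squared, z0, 1)
abs2_along_imag = difference_quotient(abs_squared, z0, 1j)

if __name__ == "__main__":
    print(conj_along_real, conj_along_imag)
    print(abs2_along_real, abs2_along_imag)
```

For z* the two quotients are close to 1 and -1, and for |z|^2 at this point they are close to 2x = 2 and -2yi = -4i, so in neither case does the ratio dw/dz have a value independent of the way dz is chosen.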

The properties of analytic functions of a complex variable sometimes
shed light on the properties of ordinary (i.e. real-valued) functions
of a real variable, properties that often seem mysterious if one does
not resort to complex variables. As an example, we can take the
function $y = (1 + x^2)^{-1}$ discussed in Section 6.3 (the graph of
this function is shown in Figure 6.3.3). The function can be expanded
in a Maclaurin’s series (formula (6.3.8)), which unexpectedly converges
only in a limited range, |x| < 1, although this function possesses no
singularities in the real domain. The reason for such behavior is that
the (analytic) function $w = (1 + z^2)^{-1}$ of the complex variable z
becomes infinite at points $z = \pm i$, which lie on the circle
|z| = 1, and it is these singularities that make it impossible to
continue Maclaurin’s series for this function outside the circle
|z| < 1. Of similar origin is the singularity at point x = 0 of the
function $y = e^{-1/x^2}$ discussed in Section 15.1 (the graph of this
function is shown in Figure 15.1.1). For the function $e^{-1/z^2}$ of
the complex variable z the point z = 0 is, as mathematicians usually
say, an essential singular point; at such a point the function cannot
have any value at all. Indeed, when we send z to zero along the real
axis, that is, z = x, with the (real) number x being small in absolute
value, we find that $-z^{-2} = -x^{-2}$ is a very large (in absolute
value) negative number and $e^{-1/z^2} = e^{-1/x^2}$ is extremely
small. But if we send z to zero along the imaginary axis, that is,
z = iy, where y is very small in absolute value, then
$-z^{-2} = -(iy)^{-2} = y^{-2}$ is a very large positive number and,
hence, $e^{-1/z^2} = e^{1/y^2} \to \infty$ as $y \to 0$. If we send z
to zero along other suitably chosen paths, the function $e^{-1/z^2}$
can tend to any prescribed value. This means that our function can in
no way be considered analytic at point z = 0 (no value can even be
assigned to it at z = 0), and it is also clear that in the neighborhood
of this point the function cannot be expanded in a power series
(however, in the neighborhood of any other point the function can be
expanded in a power series).
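Both phenomena can be observed numerically. The sketch below rests on our own assumptions: we take the Maclaurin series of $(1 + x^2)^{-1}$ to be the geometric series $1 - x^2 + x^4 - \ldots$, and the helper names and sample points are hypothetical. It compares partial sums inside and outside the circle |z| = 1, and then evaluates $e^{-1/z^2}$ at equal small distances from the origin on the real and on the imaginary axis.

```python
import cmath

def maclaurin_partial_sum(x, n_terms):
    """Partial sum of 1 - x^2 + x^4 - ..., the Maclaurin series of 1/(1 + x^2)."""
    return sum((-1) ** k * x ** (2 * k) for k in range(n_terms))

def g(z):
    """exp(-1/z^2) for complex z != 0."""
    return cmath.exp(-1 / (z * z))

inside = maclaurin_partial_sum(0.5, 60)    # |x| < 1: settles near 1/(1 + 0.25)
outside = maclaurin_partial_sum(1.5, 60)   # |x| > 1: partial sums keep growing

along_real = abs(g(0.2))    # z = x:  modulus exp(-1/x^2), tiny
along_imag = abs(g(0.2j))   # z = iy: modulus exp(+1/y^2), huge

if __name__ == "__main__":
    print(inside, 1 / (1 + 0.5 ** 2))
    print(outside, 1 / (1 + 1.5 ** 2))
    print(along_real, along_imag)
```

At x = 1.5 the function itself is perfectly finite, yet the partial sums run away; the obstacle lies at $z = \pm i$, off the real axis. Likewise, the two values of $e^{-1/z^2}$ taken at the same distance from the origin differ by many orders of magnitude.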

Note that, say, the function w = f(z) such that w = 0 for x < 0 and
w = 1 for x > 0 is not analytic, and the reason for this is not so much
the jump in w at x = 0 as the fact that the quantities x and y, that
is, the real and imaginary parts of z, are not of equal status, so to
say, in relation to function w. For similar reasons the function
w = z* = x - iy is not analytic either, since in constructing z* we
approach x and y from different “angles,” with the real part of z
remaining the same and the imaginary part of z changing its sign.

15.6 This, of course, is not true of such artificially constructed
functions as w = z**, or w = (z*)* = z, where the asterisk has a purely
formal function.

On the other hand, and this is important, if a function incorporates z
and not x or y separately or z* or |z|, it is analytic even if it
cannot be expressed in the form of a finite combination of elementary
functions. For instance, the function given by a series, say

$$w(z) = \sum_n \ldots, \qquad (15.2.8)$$

is analytic, and so is the function given by a definite integral, say

$$w(z) = \int_0^z e^{-p^2}\,dp \qquad (15.2.9)$$

(the meaning of (15.2.9), where z and p are complex numbers, is touched
on below), and the function that is a solution of a differential
equation, say $dw/dz = z^3 - w^3$, where w = 0 at z = 0, although in
all three cases w cannot be expressed in terms of z via elementary
functions. But the example of the Cauchy-Riemann equations (or the
conditions of harmonicity for functions u(x, y) and v(x, y)) shows that
we can make certain statements concerning an analytic function even if
we only know that it is analytic and know nothing more about its
concrete form.

More than that, the general theory 
of functions of a complex variable 
states that every analytic function can 
be expressed in the form of a series or 
integral . While for an arbitrary (real- 
valued) function of a real variable the 
presence of the first derivative does 


not necessarily mean that the function 
has derivatives of higher orders, for 
analytic functions of a complex vari- 
able this is not so (other functions in 
the complex domain are of no interest 
to us). The fact is that a function w = f(z) that has a first
derivative w' = f'(z) automatically has a second derivative
f''(z) = dw'(z)/dz and, in general, all higher-order derivatives (see
Exercise 15.2.2). In addition, in the neighborhood of a point
$z = z_0$ where the function w = f(z) has a derivative we have

$$f(z) = f(z_0) + \frac{f'(z_0)}{1!}(z - z_0) + \frac{f''(z_0)}{2!}(z - z_0)^2 + \ldots. \qquad (15.2.10)$$

This implies that, knowing the function in an arbitrarily small
neighborhood of point $z_0$ (this is sufficient for finding all the
derivatives of the function at point $z = z_0$, that is, all the
expansion coefficients in (15.2.10)), we can continue the function
w = f(z) outside the neighborhood by defining it via series (15.2.10).
Moreover, finding the value of the function w = f(z) at point
$z = z_1$ in this manner (and in neighboring points), we can write an
expansion at point $z = z_1$ similar to (15.2.10) and, with the help of
this expansion, find the values of w at new points. This process can be
continued indefinitely.
The very possibility of expanding ana- 
lytic functions of a complex variable 
in a Taylor’s series (15.2.10) justifies 
the use of the term “analytic” for 
functions of both real and complex 
variables. It also suggests that the idea 
of a “reasonable,” or analytic, function 
of a complex variable stands rather far 
from the general idea of an “arbitrary” 
function, the definition of which re- 
quires specifying all its values, that is, 
specifying the entire range of the inde- 
pendent variable. 
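The process of continuation by re-expansion can be imitated on a simple example. The sketch below is our own illustration, not a construction from the text: we assume the function is known only through the series $1 + z + z^2 + \ldots$ about $z_0 = 0$ (whose sum inside |z| < 1 is 1/(1 - z)), and the helper names, the new center $z_1 = -1/2$, and the truncation lengths are hypothetical choices. The Taylor coefficients at $z_1$ are computed from the original series alone, by term-by-term differentiation, and the new series is then summed at z = -1.2, a point where the original series diverges. Exact rational arithmetic is used because the alternating sums involved cancel heavily in floating point.

```python
from fractions import Fraction
from math import comb

def recentered_coeffs(z1, n_terms, inner=300):
    """Taylor coefficients at z1 of f(z) = 1 + z + z^2 + ..., using only that series.

    Term-by-term differentiation gives f^(n)(z1)/n! = sum_{k>=n} C(k, n) z1^(k-n);
    the infinite sum is truncated after `inner` terms (valid for |z1| < 1).
    """
    return [
        sum(comb(k, n) * z1 ** (k - n) for k in range(n, n + inner))
        for n in range(n_terms)
    ]

def evaluate(coeffs, z1, z):
    """Sum the re-expanded series sum_n c_n (z - z1)^n."""
    return sum(c * (z - z1) ** n for n, c in enumerate(coeffs))

z1 = Fraction(-1, 2)             # new center, inside the original circle |z| < 1
z = Fraction(-6, 5)              # target point, outside the original circle
coeffs = recentered_coeffs(z1, n_terms=40)

continued = float(evaluate(coeffs, z1, z))
closed_form = 1 / (1 - float(z))                          # 1/(1 - z) at z = -1.2
original_partial = sum(float(z) ** k for k in range(60))  # diverging partial sum

if __name__ == "__main__":
    print(continued, closed_form, original_partial)
```

The re-expanded series reproduces the value of 1/(1 - z) at a point that the original series cannot reach, and the same step could be repeated from a further center, in the spirit of the process described above.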

The properties of analytic functions of a 
complex variable are widely used in theoretical 
physics. As yet there is no complete theory of
elementary particles. But if we assume that 
such a theory can be constructed, that there 
exist formulas (unknown to us) expressing 
the main relationships of this theory, we can 
arrive at meaningful assumptions about the 
theory. It is natural to suppose that the for- 
mulas expressing the functional relationships 
are analytic. Then, even if we do not know 
their exact form, we can make some state- 
ments about the properties of elementary par- 
ticles. For instance, the mass of a particle and 
that of the corresponding antiparticle must 
coincide. As for the properties of unstable par- 
ticles, we can say that the lifetime of such a 
particle and that of the corresponding anti- 
particle coincide, too, although the decay pro- 
ducts may be quite different. There is no way 
of explaining how scientists arrive at these 
far-reaching conclusions. However, it must 
be stressed that the assumption about the 
analyticity of the formulas that have still to 
be discovered plays the main role here. 

The first attempts to apply the operations 
of differentiation and integration to complex 
quantities were made by Gottfried Leibniz 
and Johann Bernoulli. For instance, Bernoulli found the simple formula
$\int \frac{a\,dx}{b^2 + x^2} = \frac{a}{b}\arctan\frac{x}{b}$
in 1712 by using manipulations connected
with differentiation and integration of com- 
plex-valued expressions, manipulations which, 
although absolutely correct, had not yet been 
sufficiently substantiated. In the same work 
Bernoulli obtained, in the process of his transformations, the
beautiful formula
$\left(\frac{\tan\alpha - i}{\tan\alpha + i}\right)^n = \frac{\tan n\alpha - i}{\tan n\alpha + i}$,
related to the then still unknown de Moivre formula (14.1.6). However,
neither Leibniz nor Bernoulli had fully under- 
stood the situation, as is testified by the de- 
bate between the two on the meaning of the 
logarithms of negative numbers, a debate con- 
nected with the subject of complex-valued func- 
tions and their analysis. Leibniz regarded these 
logarithms as imaginary, while Bernoulli saw 
them as real quantities. The publication in 
1745 of the correspondence between Leibniz 
and Bernoulli immediately attracted the at- 
tention of the most prominent mathematicians 
of the period, Leonhard Euler and Jean d’Alem- 
bert, with Euler taking the side of Leibniz and 
d’Alembert supporting Bernoulli. Euler’s article entitled “De la
controverse entre Mrs. [Messrs.] Leibnitz et Bernoulli sur les
logarithmes négatifs et imaginaires” (On the Dispute Between Leibniz
and Bernoulli Concerning Logarithms of Negative and Imaginary
Numbers), published in 1749, contained the 
modern theory of logarithms of complex num- 
bers set forth in Section 14.3, and Euler em- 
phasized the fact that the logarithmic func- 
tion is many-valued, something which Leibniz 
and Bernoulli had not even suspected. Another 


feature characteristic of Euler was that he 
paid special attention to analytic functions of 
a real variable, which he called “continuous.”[15.7]
Euler was also inclined to believe that these 
were the only functions worthy of the atten- 
tion of mathematicians. Euler’s work abounds 
in results that today belong to the theory of 
analytic functions of a complex variable, in- 
cluding many examples of the expansion of 
such functions into series and of the application 
of these series. However, Euler did not ad- 
vance very far in studying the properties of 
analytic functions, both real-valued and com- 
plex-valued. For instance, he thought that 
knowing the values of a real-valued analytic 
function y = / (x) on an arbitrarily small 
interval over which the independent variable x 
varies permits reconstructing the function on 
the entire real x axis, which is, of course, not 
the case. 

D’Alembert made a fundamental step for- 
ward in the theory of functions when he clearly 
stipulated that both the independent variable 
and the value of the function can be both real- 
valued and complex-valued. On this point 
Euler’s investigations followed d’Alembert’s. 
The analytic function (that is, one fixed by a 
formula) w = f (z), where z = x + iy, was 
written by d’Alembert as u (x, y) + iv (x, y). 
For instance, by differentiating z k , with k 
the constant and z the variable (both complex- 
valued), he found the correct formula (in the 
form of u + iv) for raising a complex number 
to a complex power. He was also the first to 
arrive at equations that are now known as the 
Cauchy-Riemann equations, (15.2.6), in his 
Essay on a New Theory of Resistance of Flu- 
ids (1752), where he considered the motion 
of a body through a homogeneous, weightless, 
ideal fluid. (The relationship between this 
problem of mechanics and the theory of ana- 
lytic functions of a complex variable will be dis- 
cussed below in Section 17.3.) In 1761 d’Alem- 
bert derived Eqs. (15.2.7) from the Cauchy- 
Riemann equations (15.2.6). Joseph Louis 
Lagrange did much to advance all this work 
on the mechanics of liquids and on the theory 
of analytic functions of a complex variable. 

Continuing the work of d’Alembert in hyd- 
romechanics, Euler, in 1755, also obtained 
(independently of d’Alembert) Eqs. (15.2.6), 
to which he attached great importance and 
from which he obtained a number of corolla- 
ries (including results of a geometric nature 
connected with what are now called conformal 
transformations[15.8]). In 1776/1777 Euler (who was then about 70)
launched on a pioneering study (and also application) of the integrals
of functions of a complex variable (see Section 17.2). However, it was
not until 1811 that Carl Friedrich Gauss, in a letter to the well-known
German mathematician and astronomer Friedrich Wilhelm Bessel
(1784-1846), gave the first clear-cut definition of the integral of an
analytic function of a complex variable and an idea of the basic
properties of such integrals. Incidentally, this letter was published
only in 1880, when all the results set forth in it were already known
from investigations by Augustin Cauchy, who had worked on this theme
from 1813 onwards.

In some textbooks on the theory of functions of a complex variable
Eqs. (15.2.6) are called “d’Alembert-Euler equations.” It seems to us,
however, that the traditional name, “Cauchy-Riemann equations,” is more
justified. The fact is that neither d’Alembert nor Euler created the
theory of functions of a complex variable. For instance, their works
contain no full definitions of the notions of the derivative and the
integral of an analytic function; neither did they establish the
relationship between Eqs. (15.2.6) and the very fact of the existence
of the derivative of a function of a complex variable,

$$\frac{dw}{dz} = \lim_{\Delta z \to 0} \frac{w(z + \Delta z) - w(z)}{\Delta z},$$

a derivative that is independent of the way in which $\Delta z$ tends
to zero. It was Cauchy who, in a whole series of lengthy papers (the
most important of these was a paper on “complex integration” submitted
for publication in 1825), gave a consistent theory of (differentiable,
that is, analytic) functions of a complex variable. Riemann’s brilliant
doctoral dissertation of 1851 entitled “Grundlagen für eine allgemeine
Theorie der Functionen einer veränderlichen complexen Grösse”
(Fundamentals of the General Theory of the Functions of a Complex
Variable) played what was perhaps a greater role in advancing the
theory of analytic functions of a complex variable. This was the first
consistent and full exposition of the new theory and it laid the
foundation for what we today call “the geometric theory of analytic
functions.” (The dissertation produced a whole course of lectures that
Riemann delivered in Gottingen in the winter of 1855/1856 and the
summer of 1856, in which he presented an amazing wealth of material and
many profound ideas.[15.9])

15.7 At present this term has quite a different meaning (notice the way
in which Riemann uses it in his doctoral dissertation, from which we
quote later on).

15.8 A conformal transformation (of a plane or space) is one that
preserves angles between curves and that, in the neighborhood of a
point, possesses the properties of a similarity transformation. It is
easy to see that the transformation, or map, $z \to w(z)$, of the plane
of the complex variable z on itself specified by the analytic function
w = f(z) with derivative f'(z) is a conformal transformation: from the
fact that $w - w_0 \approx f'(z_0)(z - z_0)$, where $w_0 = f(z_0)$, in
the neighborhood of a point $z = z_0$ and from the rules of operation
on complex numbers (see Chapter 14) it follows that near $z_0$ the
transformation consists of a translation of the complex z plane to the
point $w_0$ (by the vector $\overrightarrow{z_0 w_0}$), a $\rho$-fold
dilation, where $\rho = |f'(z_0)|$, and a rotation through the angle
$\varphi$, with $\varphi = \operatorname{Arg} f'(z_0)$.

An altogether different line in the theory 
of functions of a complex variable was begun 
by Karl Theodor Wilhelm Weierstrass (1815- 
1897) of Berlin, a man who was indisputably 
one of the most distinguished mathematicians 
of the 19th century. Unlike Riemann’s rea- 
soning, the Weierstrassian theory of functions 
could quite aptly be called “algebraic.” Weier- 
strass, to whom geometric concepts were 
completely alien and whose thinking followed 
a strictly formal direction, harshly criticized 
Riemann (for whom, at the same time, he had 
profound respect and whose investigations he 
held in the highest esteem) for drawing on 
geometric pictorialness and for a somewhat 
casual attitude toward logic; the modern (very 
strict) criteria of “mathematical rigor” owe 
their origin largely to Weierstrass. In the theo- 
ry of functions Weierstrass’s main tools were 
power series and operations performed on such 
series; he greatly advanced the “formal al- 
gebraic” theory of analytic functions of a com- 
plex variable. For one thing, following an alto- 
gether different approach, he obtained a large 
number of results that Riemann had first stat- 
ed but had not backed up with flawless proofs. 
The rivalry between Riemann and Weier- 
strass can be compared with the relations be- 
tween Leibniz and Newton. Here Weierstrass 
followed Leibniz, and Riemann followed New- 
ton. 

Riemann was a most profound thinker 
in the realm of physics and to a certain extent 
anticipated the general idea of Einstein’s theo- 
ry of gravity and created a consistent mathe- 
matical formalism without which the general 
theory of relativity would have been impos- 
sible. The mutual respect in which these two 
scholars held each other, although their think- 
ing was completely different (they did not 
even understand each other very well!) shows 
that differences in scientific ideology do not 
necessarily lead to personal enmity. The exam- 
ple of how Riemann and Weierstrass, proceed- 
ing “from different angles,” built the theory 
of analytic functions of a complex variable, 
serves as another reminder of the fruitful- 
ness of different approaches in the process of 
scientific investigation. 

15.9 This course of lectures, which played a great role in the history
of mathematics, was attended by only three persons. (By the way, on
p. 210 we mentioned a most important lecture course by Johann Bernoulli
which was attended by only one student.)

We conclude this chapter with an extensive quotation from the
above-mentioned dissertation by Riemann.[15.10] It embraces the main
points of everything we have said above and expressively demonstrates
how it all began:

Let z denote a variable that can change without limit and assumes all
possible real values. If to each of its values there corresponds only
one value of another variable, w, then w will be called a function of
z. If, furthermore, while variable z changes continuously and runs
through all the values contained between two constant limiting points,
the magnitude of w also changes continuously, then within the aforesaid
interval this function is called continuous. It is clear that this
definition in no way establishes a connection between separate values
of the function: if the behavior of the function is indicated only in a
certain definite interval, beyond this interval it can be continued
quite arbitrarily.

The relationship between variable w and variable z can be fixed by a
mathematical formula in such a way that the value of w corresponding to
each value of z is obtained through certain mathematical operations
conducted on the numerical value of z....

Suppose that the range of variable z is not restricted to real values
but spreads to complex numbers x + yi, with $i = \sqrt{-1}$, as well.
Let x + yi and (x + dx) + (y + dy)i be two infinitely close values of z
to which there correspond two values of variable w, namely u + vi and
(u + du) + (v + dv)i. In this case, if the dependence of variable w on
variable z is assumed to be quite arbitrary, the ratio
(du + dv i)/(dx + dy i), generally speaking, changes with dx and dy. If
we assume that $dx + dy\,i = \varepsilon e^{\varphi i}$, we clearly see
that

$$\frac{du + dv\,i}{dx + dy\,i} = \frac{1}{2}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + \frac{1}{2}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)i + \frac{1}{2}\left[\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} + \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right)i\right]e^{-2\varphi i}.$$

But each time the function w of variable z is defined by a formula
containing simple mathematical operations, it is found that the value
of the derivative dw/dz is independent of the value of the differential
dz.[15.11] Thus, it becomes clear that not every dependence of the
complex variable w on the complex variable z can be expressed in the
above manner.

This characteristic property of all functions that can be defined in
terms of elementary mathematical operations will be put at the basis of
our further investigation..., namely, we will proceed from the
following definition...:

A complex variable w is said to be a function of another complex
variable z if both vary in such a manner that the value of the
derivative dw/dz does not depend on the value of dz.

...If we represent the ratio (du + dv i)/(dx + dy i) in the form

$$\frac{\left(\dfrac{\partial u}{\partial x} + \dfrac{\partial v}{\partial x}\,i\right)dx + \left(\dfrac{\partial v}{\partial y} - \dfrac{\partial u}{\partial y}\,i\right)dy\,i}{dx + dy\,i},$$

it becomes clear that the ratio has one and only one value for all
values of dx and dy provided that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.$$

Thus, these conditions are necessary and sufficient for w = u + vi to
be a function of z = x + yi. This leads us to the following conditions
imposed on the separate terms of the function:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \qquad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0,$$

which serve as a basis for studying the properties of each term of such
functions when it is considered separately from the other term.


Exercises 

15.2.1. Check the validity of the Cauchy-Riemann equations for the functions of a complex variable that were encountered in Exercise 14.1.1 and for the function $w = \ln z$.

15.2.2. Suppose that $w = f(z) = u(x, y) + iv(x, y)$ is an analytic function of the complex variable $z$ and $f'(z) = dw/dz = u_1(x, y) + iv_1(x, y)$ is its derivative. Prove that the Cauchy-Riemann equations for $f(z)$ automatically imply the validity of the Cauchy-Riemann equations for $f'(z)$.


15.10 G. F. B. Riemann, Gesammelte mathematische Werke, 2nd ed., Dover (reprint), New York, 1953, pp. 3-43.

15.11 This statement is obviously valid in all cases where $dw/dz$ can be obtained from the expression of $w$ in terms of $z$ via the ordinary rules of differentiation.... (Riemann’s footnote.)



Chapter 16 Dirac’s Remarkable Delta Function 


16.1 Various Ways of Defining 
a Function 


The functions we have been studying 
up to now have ordinarily been defined 
by formulas. This means that a pro- 
cedure was always indicated for com- 
puting the values of the function for 
any given value of the independent 
variable. We could call this an algo- 
rithmic representation (algorithm mean- 
ing a procedure for computing some- 
thing). To illustrate, take the function 
$y = f(x) = 2x + 3x^2$. It actually
amounts to this: “take $x$, multiply by 2,
square x, multiply by 3, and then add 
the two numbers to get the value of y 
for the given value of x.” Trigonometric 
functions were defined differently, by 
means of geometric concepts, by mea- 
suring arcs and line segments in a circle. 
However, here too we can speak of an 
algorithm for calculating their values, 
which amounts to turning to trigono- 
metric tables or to a pocket calculator 
with the appropriate keys. 

Up to now we have studied the prop- 
erties of functions specified in this 
fashion, the laws of increase and de- 
crease of functions, the laws for finding 
the maxima and minima of functions, 
etc. The study of these properties 
leads to new ways of defining functions. 
For instance, the function $f = e^x$ may be defined as a function whose derivative is equal to the function itself: $df/dx = f$, with the supplementary condition that $f(0) = 1$. The sine, or the function $\varphi = \sin x$, and the cosine, or $\psi = \cos x$, may be defined as functions that satisfy one and the same equation

$$\frac{d^2\varphi}{dx^2} = -\varphi, \qquad \frac{d^2\psi}{dx^2} = -\psi$$

under different initial conditions

$$\varphi(0) = 0, \quad \left.\frac{d\varphi}{dx}\right|_0 = 1; \qquad \psi(0) = 1, \quad \left.\frac{d\psi}{dx}\right|_0 = 0.$$


These definitions are in many respects more to the point, so to say, and more closely related to the applications of the exponential function and of trigonometric functions to many problems of physics, say, to problems involving oscillations, than are the definitions

$$e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n, \qquad e^x = \lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n,$$

and of the sine and cosine as certain segments in a circle or as ratios of sides of a right triangle.

It is curious to note that the definitions of $e^x$, $\sin x$, and $\cos x$ via differential equations prove to be convenient for electronic computers. If in a calculation the investigator has to substitute into a formula the values of $e^x$ for different values of $x$, he copies them out of a table. When operating a computer, it is more convenient and faster to have the machine compute $e^x$ step by step via the approximate formula $e^{x+\Delta x} \approx e^x (1 + \Delta x)$ in accordance with the equation $dy/dx = y$ for the function $e^x$ (or via more exact formulas based on the same equation) than it is to refer to a table of values. The same goes for the functions $\sin x$ and $\cos x$. It is easier to compute them every time simultaneously:^{16.1}

$$\sin(x + \Delta x) \approx \sin x + \Delta x \cos x,$$
$$\cos(x + \Delta x) \approx \cos x - \Delta x \sin x.$$
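The step-by-step computation just described can be tried out directly. The following Python sketch is only an illustration under our own assumptions: the names `step_exp` and `step_sin_cos` and the number of steps are hypothetical choices, not something taken from the text.

```python
import math

def step_exp(x_end, n_steps):
    """Advance dy/dx = y from y(0) = 1 via y(x + dx) ~ y(x) * (1 + dx)."""
    dx = x_end / n_steps
    y = 1.0
    for _ in range(n_steps):
        y *= 1.0 + dx
    return y

def step_sin_cos(x_end, n_steps):
    """Advance sin and cos together, starting from sin 0 = 0, cos 0 = 1."""
    dx = x_end / n_steps
    s, c = 0.0, 1.0
    for _ in range(n_steps):
        # Both updates use the old values of s and c.
        s, c = s + dx * c, c - dx * s
    return s, c

if __name__ == "__main__":
    print(step_exp(1.0, 100_000), math.e)
    print(step_sin_cos(1.0, 100_000), (math.sin(1.0), math.cos(1.0)))
```

The smaller the step, the closer the stepped values come to the library values; the "more exact formulas" mentioned above would reach the same accuracy in fewer steps.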

Thus, one general approach to the 
concept of a function lies in specifying 
a procedure for computing it and in 
subsequently investigating it. There 
is also another approach. We can seek 
a function with definite general prop- 
erties, the aim being later to attempt 
(on the basis of these properties) to 
find the formula that describes the 
function at hand. Such is the usual procedure when handling experimental data and finding empirical formulas by trial. Our object here is to construct in this way a remarkable function that is useful and important both in mathematics and in its applications.

16.1 Where do you think these formulas come from? Check to see whether they accord with the equation for the second derivative of the sine and cosine.


16.2 Dirac and His Function 

Paul Adrien Maurice Dirac (1902-1984),
the celebrated English theoretical phys- 
icist, came to fame in 1929. He had 
elaborated a theory capable of de- 
scribing the motions of electrons in 
electric and magnetic fields with arbit- 
rary velocities almost up to the velocity 
of light. This was the quantum theory 
which also accounts for the fact that 
the electrons in an atom move only in 
specific orbits with definite energy val- 
ues. Dirac knew that an electron po- 
ssesses a definite rotational moment, or 
is similar to a spinning top, and he 
took this into account in building his 
theory. When the theory was con- 
structed, it turned out that a conclusion 
could be drawn that Dirac had not 
foreseen, namely, the existence of par- 
ticles with mass the same as the electron 
mass but with opposite (positive) 
charge. For two years it was thought 
that Dirac’s theory was good for de- 
scribing electron motion but that the 
conclusion concerning particles with 
positive electron charge was erroneous 
and that as soon as he got rid of it the 
theory would be a very good one. 

But in 1932 a positively charged 
particle with the electron mass, called 
the positron (also called the antiparticle 
of the electron), was discovered. The 
big drawback of Dirac’s theory became 
its triumph, its principal contribution: 
Dirac’s discovery was the first instance 
of a new particle being discovered “at 
the tip of a pencil.” This is an instruc- 
tive example from the standpoint of 
the relationships of theory and experi- 
ment. Theory rests on the findings of 
experiment, but a consistent, logical 
and mathematical, development of a 
theory takes the investigator beyond 
the confines of the material used as its foundation and leads to fresh predictions.

Paul Dirac

Dirac was not only one of the best theoretical physicists in the world, he was a marvelous mathematician. In his classical Principles of Quantum Mechanics Dirac introduced and made wide use of a new function, which he denoted by $\delta(x)$. It is known as Dirac's delta function or, simply, the delta function.

The delta function can be defined as follows: $\delta(x) = 0$ for any $x \neq 0$, that is, for $x < 0$ and $x > 0$, and $\delta(0) = \infty$. In addition, the delta function must satisfy the following condition:

$$\int_{-\infty}^{+\infty} \delta(x)\,dx = 1. \tag{16.2.1}$$

Figure 16.2.1 gives a pictorial view of a function similar to the delta function. The narrower we make the strip between the left and right branches, the higher the strip must be for its area (the integral, that is) to retain the given value of 1. As the strip becomes ever narrower, we approach the condition $\delta(x) = 0$ for $x \neq 0$, and the function approaches the delta function. In one of the following sections these arguments will be utilized in the construction of formulas that yield the delta function. Here we continue the study of its general properties.

Figure 16.2.1

It must be stressed, of course, that the “equations” defining the delta function,

$$\delta(x) = \begin{cases} 0 & \text{if } x \neq 0, \\ \infty & \text{if } x = 0, \end{cases} \qquad \int_{-\infty}^{+\infty} \delta(x)\,dx = 1, \tag{16.2.2}$$

must not be taken too literally, since from the standpoint of classical mathematics (that is, the mathematics we are accustomed to) these conditions are meaningless and contradictory. It is the above-described process of stretching (and contraction) of the strips that clarifies the meaning of the delta function; the “intuitive” definition (16.2.2) of the delta function can only stimulate our imagination in relation to this process.


It is clear that by defining the delta function we introduce a completely new object, so unlike any other object discussed earlier (there is no way in which we can say that our new function is analytic in the sense of Chapter 15). However, the description of this new object is fairly simple and must not put us off. For example, the number $\sqrt{2}$, which is constantly used in mathematics, is not a “real” number from the standpoint of arithmetic: it is not given by its exact value but by the sequence 1, 1.4, 1.41, 1.414, 1.4142, ... consisting of approximate values of $\sqrt{2}$ (which, incidentally, characterizes $\sqrt{2}$ sufficiently in order to freely use it). Similarly, $\delta(x)$ is given not by its exact values but by approximations (tall and narrow “steps”) which completely define the function.^{16.2}

Dirac and other physicists have for many years used the delta function freely and productively, without any strict definitions existing for such strange objects, just as mathematicians for centuries used irrational numbers until, at the end of the 19th century, the first theories of the real variable appeared. Mathematical justification for such remarkable entities as the delta function (“generalized functions”, or distributions, as they are usually called) was given by the Soviet mathematicians S. L. Sobolev and I. M. Gel’fand, the French mathematician Laurent Schwartz, and others; this topic is discussed in many books.^{16.3}

Possibly, the following important formula gives a better description of the Dirac delta function than (16.2.2):

$$\int_{-\infty}^{+\infty} f(x)\,\delta(x)\,dx = f(0). \tag{16.2.3}$$

Indeed, since $\delta(x) = 0$ for $x \neq 0$, we conclude that the value of the integral on the left-hand side of (16.2.3) cannot depend on the values of $f(x)$ at any $x \neq 0$. The only essential value of $f(x)$ is the one where $\delta(x) \neq 0$, that is, for $x = 0$. This means that in the narrow region where $\delta(x) \neq 0$ (in the limit the width of this region tends to zero; see Figure 16.2.1) $\delta(x)$ is multiplied by $f(0)$. Hence, formula (16.2.3) follows from the conditions (16.2.2).
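Formula (16.2.3) can be checked numerically by putting a tall, narrow rectangle of unit area in place of $\delta(x)$. The sketch below is a rough illustration only; the helper name `sift`, the rectangle width and the midpoint rule are our own assumptions.

```python
import math

def sift(f, width, shift=0.0, n=1000):
    """Integrate f(x) times a rectangle of height 1/width centred at `shift`.

    The rectangle has unit area and vanishes outside
    |x - shift| < width / 2, so the midpoint rule runs over that strip only.
    """
    a = shift - width / 2.0
    h = width / n
    height = 1.0 / width
    return h * sum(f(a + (i + 0.5) * h) * height for i in range(n))

if __name__ == "__main__":
    for width in (1.0, 0.1, 0.01):
        # The values approach f(0) = cos 0 = 1 as the strip narrows.
        print(width, sift(math.cos, width))
    # A shifted rectangle picks out the value of f at the shift point.
    print(sift(math.exp, 0.01, shift=1.0), math.e)
```

For a wide rectangle the result is an average of $f$ over the strip; only as the width shrinks does it settle on the single value of $f$ at the centre.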


16.2 For those interested, here are the “decimal” approximations of the delta function: $y = 0$ for $|x| > 1$, $y = 0.5$ for $|x| < 1$; $y = 0$ for $|x| > 0.1$, $y = 5$ for $|x| < 0.1$; $y = 0$ for $|x| > 0.01$, $y = 50$ for $|x| < 0.01$; ... (it is more convenient to “smooth out” these steps, so that all functions are made continuous); below we will give other (and still more convenient) definitions of the delta function.

16.3 Possibly, the simplest of these are two small books by J. Mikusinski and R. Sikorski, The Elementary Theory of Distributions, published in English in Warsaw in 1957 and 1961. For instance, in the first book the authors introduce generalized functions as limits of sequences consisting of ordinary functions, say, the functions $\varphi_n(x)$, where $\varphi_n(x) = 0$ for $|x| > 1/10^n$, $\varphi_n(x) = 10^n/2$ for $|x| < 1/10^n$; see footnote 16.2. (It goes without saying that the concept of the limit of a function requires a rigorous definition.)

We can also argue in reverse. We can say that $\delta(x)$ is a function such that no matter what form the auxiliary function $f(x)$ has, we always have formula (16.2.3). This condition alone brings us to all the conclusions concerning the form of $\delta(x)$ that were earlier employed in the definition of $\delta(x)$. From formula (16.2.3) it follows that $\delta(x) = 0$ at $x \neq 0$, that $\int \delta(x)\,dx = 1$, and that $\delta(0) = \infty$.^{16.4}

Note that, of course, to grasp the meaning of formula (16.2.3) requires, like the definition (16.2.2) of the delta function, a certain effort of thought, since the integrand on the left-hand side of (16.2.3) contains a thing that can only loosely be called a function. However, a “rough” treatment of the delta function makes it possible to justify both the doubtful definition (16.2.2) and formula (16.2.3), that is to say, both formulas are sufficient for applying this new concept and must always be taken into account.

Let us carry through a few more obvious consequences of the definition of $\delta(x)$. By the general rule of a change of variables (discussed in detail in Section 1.7), the function $\delta(x - a)$ is displaced $a$ units to the right in relation to $\delta(x)$, that is, $\delta(x - a) = \infty$ at $x = a$ and is zero otherwise. Accordingly,

$$\int_{-\infty}^{+\infty} f(x)\,\delta(x - a)\,dx = f(a). \tag{16.2.3a}$$


Now it is easy to see, if we consider a curve of the form depicted in Figure 16.2.1, that $b\,\delta(x)$ is $b$ times higher than $\delta(x)$ and that $\delta(cx)$ is $|c|$ times narrower than $\delta(x)$, so that the area under the curve $\delta(cx)$ is $|c|$ times smaller than the area under $\delta(x)$. Therefore,

$$\int f(x)\,b\,\delta(x)\,dx = b f(0), \tag{16.2.4}$$

$$\int f(x)\,\delta(cx)\,dx = \frac{1}{|c|}\,f(0). \tag{16.2.5}$$

A comparison of (16.2.4) and (16.2.5) enables us to say that

$$\delta(cx) = \frac{1}{|c|}\,\delta(x).$$

Formula (16.2.4) is quite obvious, but we can also hope that the reader who has gone through the trials and tribulations of the preceding chapters of this book will be able to grasp formula (16.2.5) as well. Formally, it is readily obtained by the change of variable

$$u = |c|\,x, \qquad dx = \frac{1}{|c|}\,du.$$

Here we also make use of the fact that the function $\delta(x)$ defined by formula (16.2.2) is an even function of its argument: $\delta(-x) = \delta(x)$.

16.4 Here the integral sign without any limits of integration will always be understood as being over the entire range of the variable of integration, from $-\infty$ to $+\infty$.
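The scaling rule (16.2.5) can also be seen numerically. In the sketch below a narrow Gaussian bell of unit area stands in for $\delta(x)$; this choice of stand-in, the parameter `n` and the helper names are assumptions made purely for illustration.

```python
import math

def narrow_bell(x, n):
    """Smooth stand-in for delta(x): n / sqrt(pi) * exp(-(n x)^2), area 1."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def integrate(g, a, b, steps=200_000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def scaled_sift(f, c, n=200):
    """Approximate the integral of f(x) * delta(c x) dx."""
    return integrate(lambda x: f(x) * narrow_bell(c * x, n), -1.0, 1.0)

if __name__ == "__main__":
    for c in (1.0, 2.0, -3.0):
        # Expected limit: f(0) / |c| with f = cos, that is, 1 / |c|.
        print(c, scaled_sift(math.cos, c), 1.0 / abs(c))
```

Note that a negative $c$ gives the same result as $|c|$, in line with the evenness of $\delta(x)$.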

Exercises 

16.2.1. Show that for a function $\varphi(x)$ having a unique zero $x_0$, so that $\varphi(x_0) = 0$ and $\varphi(x) \neq 0$ at $x \neq x_0$, we have the formula $\delta(\varphi(x)) = \dfrac{1}{|\varphi'(x_0)|}\,\delta(x - x_0)$. What is the function $\delta(\psi(x))$ if the function $\psi(x)$ vanishes at several values of $x$?

16.2.2. Evaluate the integral

$$\int_{-\infty}^{+\infty} f(x)\,\delta(\sin x)\,dx.$$

16.3 Discontinuous Functions 
and Their Derivatives 

Let us consider the integral of $\delta(x)$ as a function of the upper limit, that is,

$$\theta(x) = \int_{-\infty}^{x} \delta(z)\,dz. \tag{16.3.1}$$

It is easy to see that the graph of this function has the form of a step (Figure 16.3.1). Indeed, as long as $x < 0$, the domain of integration in (16.3.1) is wholly located where $\delta(x) = 0$, which means that $\theta(x) = 0$ when $x < 0$. But if $x > 0$, the domain of integration involves the neighborhood of the origin, where $\delta(0) = \infty$. On the other hand, since $\delta(x) = 0$ when $x > 0$, the value of the integral does not change when the upper limit changes from 0.1 (or even 0.000001) to 1 or to 10 or to $\infty$. Hence, for any $x > 0$ we have

$$\theta(x) = \int_{-\infty}^{x} \delta(z)\,dz = \int_{-\infty}^{+\infty} \delta(z)\,dz = 1,$$

as is shown in Figure 16.3.1.

Figure 16.3.1

Thus, with the aid of the delta function we have constructed the simplest discontinuous function $\theta(x)$ such that $\theta(x) = 0$ when $x < 0$, and $\theta(x) = 1$ in the domain $x > 0$.^{16.5} These simple considerations enable us to approach the problem of the derivative of a function having discontinuities in a more consistent fashion, without apparent exceptions and extensive reservations.

If we did not know about the delta function, we would have to say that derivatives cannot be found at points where a function is discontinuous. But we have just constructed a discontinuous function, $\theta(x)$. The general rule of the relationship between an integral and the derivative (see Section 3.3) is:

$$\text{if } F(x) = \int_{x_0}^{x} y(z)\,dz, \text{ then } y(x) = \frac{dF}{dx}.$$

If we apply it to the expression (16.3.1), we obtain

$$\frac{d\theta}{dx} = \delta(x). \tag{16.3.2}$$

Thus, we do not need to make an exception for the derivative of a discontinuous function. We merely say that at the point of discontinuity the derivative is equal to a “singular” function, the delta function.

16.5 At point $x = 0$ the function $\theta(x)$ undergoes a jump, and there is no need to define the value of $\theta(0)$ any further; suffice it to say that at this point the value of $\theta(x)$ changes from 0 to 1.

Figure 16.3.2

We have learned to handle the derivative of an elementary discontinuous function and can now very simply find derivatives in more complicated situations. Here are some examples. Let

$$y = x \text{ for } x < 1 \quad\text{and}\quad y = x - 2 \text{ for } x > 1. \tag{16.3.3}$$

We refer to the graph of this function in Figure 16.3.2. The jump occurs at $x = 1$, and the magnitude of the jump is $y(1 + 0) - y(1 - 0) = -2$. (Here we use the notation $y(1 + 0)$ to denote the limiting value of $y$ as $x$ approaches 1 from the right, that is, from the direction of $x > 1$, and $y(1 - 0)$ denotes the same on the left; see Figure 16.3.2.) From this we formally get

$$\frac{dy}{dx} = 1 - 2\delta(x - 1). \tag{16.3.4}$$

This notation is better than the dreary statement that $dy/dx = 1$ everywhere except at point $x = 1$, where the function has a discontinuity and does not have a derivative.
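One way to convince oneself that (16.3.4) carries all the information about the function (16.3.3) is to integrate it numerically. The sketch below is an illustration under our own assumptions: a narrow Gaussian bell replaces $\delta(x - 1)$, and the names `narrow_bell` and `restore_y` are hypothetical.

```python
import math

def narrow_bell(x, n=200):
    """Smooth stand-in for delta(x): n / sqrt(pi) * exp(-(n x)^2), area 1."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def restore_y(x_end, steps=200_000):
    """Integrate dy/dx = 1 - 2 delta(x - 1) from x = 0, y = 0 up to x_end > 0."""
    h = x_end / steps
    y = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th step
        y += h * (1.0 - 2.0 * narrow_bell(x - 1.0))
    return y

if __name__ == "__main__":
    # To the left of the jump y = x; to the right y = x - 2.
    for x_end in (0.5, 2.0, 3.0):
        print(x_end, restore_y(x_end))
```

Away from the immediate neighborhood of $x = 1$ the restored values agree with (16.3.3): the coefficient $-2$ of the delta function reappears as the drop of the graph by two units.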

The delta function is a typical brainchild of the twentieth century. The nineteenth century had a passion for investing all its arguments (true, not quite true, and simply false) in the form of “impossibilities.” It is impossible to invent a perpetual motion machine, transformations of chemical elements are impossible, it is impossible to change the total mass of a substance, it is impossible to prove the existence of atoms, it is impossible to determine the composition of the stars, it is impossible to find the derivative of a discontinuous function. The twentieth century has found numerous constructive solutions to what appeared to be impossible in the nineteenth century, say, to the problem of element transformations or to the reduction of the total mass of a substance by transforming mass into energy. To take an example, the delta function resolves the problem of the derivative at a point of discontinuity (at any rate for a discontinuity in the form of a finite jump). Indeed, the notation of (16.3.4) contains in one line the fact of discontinuity of the function $y(x)$ (since $dy/dx$ involves the delta function), the nature of the discontinuity (a jump), the site of the discontinuity ($x = 1$), and the magnitude of the jump (the coefficient $-2$ of the delta function).

Integrating (16.3.4) with the condition $x = 0$, $y = 0$, we can restore the graph of $y(x)$ in its entirety. True, in the case of the function (16.3.3) we had it easy in the sense that a simple case was especially chosen where the derivatives on the left and on the right are expressed by a single formula. This of course is not obligatory. Why should the derivative be continuous if the function itself suffers a discontinuity?

Let us consider a more complicated example: $y = -x^2$ for $x < 1$, and $y = x^2$ for $x > 1$ (Figure 16.3.3). For $x < 1$ we have $y' = -2x$, while for $x > 1$ we have $y' = +2x$. The jump at $x = 1$ is $y(1 + 0) - y(1 - 0) = 1 - (-1) = 2$, so the discontinuity is associated with the term $2\delta(x - 1)$ in $y'$. We can now, at our pleasure, adjoin the point $x = 1$ to the left-hand region and then write $y' = -2x + 2\delta(x - 1)$ for $x \le 1$ and $y' = 2x$ for $x > 1$. Or, another version (quite similar to the first), we can adjoin $x = 1$ to the right-hand region and then, with the same full justification, write $y' = -2x$ for $x < 1$ and $y' = 2x + 2\delta(x - 1)$ for $x \ge 1$. (Note how the signs $<$ (less than) and $\le$ (less than or equal to), $>$ (greater than), and $\ge$ (greater than or equal to) are placed in the formulas. Be careful not to write the delta function twice.^{16.6}) We can also write

$$y' = \varphi(x) + 2\delta(x - 1), \quad\text{where}\quad \varphi(x) = \begin{cases} -2x & \text{if } x < 1, \\ 2x & \text{if } x > 1. \end{cases}$$

To verify the notation, integrate the expression of the derivative. You will obtain the original discontinuous function.

Sometimes in mathematics use is made of the so-called signum function $\operatorname{sgn} x$,^{16.7} which is defined thus: $\operatorname{sgn} x = -1$ for $x < 0$ and $\operatorname{sgn} x = +1$ for $x > 0$. We can also write $\operatorname{sgn} x = x/|x|$, where $|x|$ is the modulus (absolute value) of $x$. The curve of the signum function, $y = \operatorname{sgn} x$, is shown in Figure 16.3.4a. It is easy to see that

$$\operatorname{sgn} x = -1 + 2\theta(x) = -1 + 2\int_{-\infty}^{x} \delta(z)\,dz. \tag{16.3.5}$$

16.6 Thus, the derivative of a discontinuous function $f(x)$ at the point of discontinuity $x = x_0$ has a definite “value” $\delta(x - x_0)$. As for the discontinuous function $f(x)$ itself, there is no sense in asking for its value at the actual point of discontinuity. At any rate, this question is meaningless in nearly all applied problems.

16.7 Sgn $x$ is often read as “signum $x$” or “the sign of $x$” since the Latin for sign is signum. Sometimes it is assumed that $\operatorname{sgn} 0 = 0$, but we will never use this physically meaningless notation.

Figure 16.3.4

Figure 16.3.5

Now let us examine the function $|x|$ itself. Its graph is depicted in Figure 16.3.4b. We find the derivatives to be $d|x|/dx = -1$ for $x < 0$ and $d|x|/dx = +1$ for $x > 0$ or, briefly, with the aid of the new function,

$$\frac{d|x|}{dx} = \operatorname{sgn} x. \tag{16.3.6}$$

From formula (16.3.5) it then follows that

$$\frac{d^2|x|}{dx^2} = 2\delta(x). \tag{16.3.7}$$

This formula is so important that one would do well to get a good feeling of it. We know that the second derivative of a function is connected with the curvature of the curve on a graph (see Section 7.9); the curvature of a straight line is zero since the second derivative of a linear function is zero. It would appear then that if the graph of $|x|$ consists of two straight lines, on each of which $d^2|x|/dx^2 = 0$, then why not simply say that the second derivative is equal to zero everywhere in this case? Of course, the crux of the matter lies in the salient point at $x = 0$, where the two straight lines meet. But how are we to be sure that it is at the salient point that the value of the second derivative is $2\delta(x)$? To assure ourselves of this, let us round off the salient point: take the function

$$y_1 = \sqrt{x^2 + a^2}. \tag{16.3.8}$$

The smaller the value of $a$, the closer this function is to the broken line $y = |x|$ (Figure 16.3.5a). Clearly, at $a = 0$ formula (16.3.8) yields $y_1 = |x|$. The function specified by Eq. (16.3.8) is smooth and differentiable everywhere. Employing the rules of Chapter 4, we can easily find that

$$\frac{dy_1}{dx} = \frac{x}{\sqrt{x^2 + a^2}}, \qquad \frac{d^2y_1}{dx^2} = \frac{a^2}{(x^2 + a^2)^{3/2}}. \tag{16.3.9}$$

Figure 16.3.5b depicts the graph of $d^2y_1/dx^2$ (both Figure 16.3.5a and Figure 16.3.5b are constructed for $a = 0.5$). It is easy to see that this curve becomes higher and narrower as $a$ decreases; the integral $\int (d^2y_1/dx^2)\,dx$ is equal to 2 for all values of $a$ (see Exercise 16.3.3); whence, in the limit when $a = 0$ we obtain the expression $d^2|x|/dx^2 = 2\delta(x)$ given above.

Exercises

16.3.1. Draw the graphs of the following (discontinuous) functions and write down the (first) derivatives of the same functions: (a) $y = x$ if $x < 1$ and $y = x - 1$ if $x > 1$; (b) $y = e^{1/x}/(1 + e^{1/x})$.

16.3.2. Find the curvature of the curve $y_1 = y_1(x)$ representing function (16.3.8). How does the curvature change when we send $a$ to zero?

16.3.3. Prove that $\int_{-\infty}^{\infty} y_1''(x)\,dx = 2$ for the function in (16.3.8).

16.4 Representing the Delta Function
by Formulas

At the end of the previous section we inadvertently obtained an expression

$$y = \frac{a^2}{2(x^2 + a^2)^{3/2}}, \tag{16.4.1}$$

which approaches $\delta(x)$ in the limit as $a$ tends to zero.^{16.8} Let us examine in detail the problem of an analytic expression for the delta function.
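Before going further, the two features that make (16.4.1) resemble the delta function can be checked numerically: the area under the curve stays equal to 1 while the peak grows as $a$ decreases. The sketch below is a hedged illustration; the names `bell` and `area`, the cut-off at $|x| = 1000a$ and the midpoint rule are our own choices.

```python
def bell(x, a):
    """Expression (16.4.1): a^2 / (2 (x^2 + a^2)^(3/2))."""
    return a * a / (2.0 * (x * x + a * a) ** 1.5)

def area(a, steps=200_000):
    """Midpoint-rule area under bell(x, a) over |x| < 1000 a.

    The part of the area outside this range is of the order of 1e-6.
    """
    half_range = 1000.0 * a
    h = 2.0 * half_range / steps
    return h * sum(bell(-half_range + (i + 0.5) * h, a) for i in range(steps))

if __name__ == "__main__":
    for a in (0.5, 0.1, 0.01):
        # The peak height is 1 / (2 a); the area stays close to 1.
        print(a, bell(0.0, a), area(a))
```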

Let us take a function $\varphi(x)$ of which we only demand that it vanish for $x = \pm\infty$ (that is, we require that $\varphi(x) \to 0$ as $x \to \pm\infty$) and that the integral $I = \int \varphi(x)\,dx$ be nonzero. It is always possible to make this integral equal to unity by multiplying $\varphi$ by an appropriate constant. Suppose that this has already been done, so that $\int \varphi(x)\,dx = 1$. It is clear that $\varphi(x)$ has a maximum somewhere between $-\infty$ and $+\infty$. The simplest examples are functions that are everywhere positive and even, that is, symmetric about the $y$ axis (see Section 1.7). Here are some concrete instances:

$$\varphi_1(x) = \frac{1}{2(1 + x^2)^{3/2}}, \qquad \varphi_2(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2}, \qquad \varphi_3(x) = \frac{1}{\sqrt{\pi}}\,e^{-x^2}.$$

The graph of each of these functions is bell-shaped (Figure 16.4.1). If these graphs are brought to the same height (see below), it is hard to distinguish them at a glance. Incidentally, the last (exponential) function is much closer than the others to the axis of abscissas for large values of $x$ far away from the maximum.^{16.9} Now recall what needs to be done to increase the height of the bell $n$-fold and decrease the width $m$-fold: take $n\varphi(mx)$. If the area under the bell is to be preserved, choose $n = m$. To summarize, then, the function $n\varphi(nx) \to \delta(x)$ as $n \to \infty$ or, to put it otherwise, $\delta(x) = \lim_{n\to\infty} n\varphi(nx)$. It is also easy to formally verify that

$$\int n\varphi(nx)\,dx = \int \varphi(z)\,dz = \int \varphi(x)\,dx = 1$$

via the substitution $z = nx$.

16.8 Note the “2” in the denominator of (16.4.1); without it we would have $2\delta(x)$.

Figure 16.4.1

Thus, to the three variants of $\varphi(x)$ correspond the following three representations of the delta function:

$$\delta(x) = \lim_{n\to\infty} \frac{n}{2(1 + n^2x^2)^{3/2}}, \qquad \delta(x) = \lim_{n\to\infty} \frac{n}{\pi(1 + n^2x^2)}, \qquad \delta(x) = \lim_{n\to\infty} \frac{n}{\sqrt{\pi}}\,e^{-n^2x^2}. \tag{16.4.2}$$

Let us verify that the procedure that was proposed earlier,

$$\delta(x) = \lim_{a\to 0} \frac{a^2}{2(x^2 + a^2)^{3/2}},$$

fits this definition. To do this, we write

$$\frac{a^2}{2(x^2 + a^2)^{3/2}} = \frac{a^2}{2a^3(x^2/a^2 + 1)^{3/2}} = \frac{1}{2a(x^2/a^2 + 1)^{3/2}}$$

and set $n = 1/a$ to get the first representation of the delta function in accord with (16.4.2).

16.9 The intersection of all three curves $\varphi_1$, $\varphi_2$, and $\varphi_3$ at just about the same points in Figure 16.4.1 is of course purely accidental.

We conclude that there is no single simple formula that can yield $\delta(x)$. Clearly, the fact that $\delta(0) = \infty$ is not enough. To define $\delta(x)$, it still remains to be demonstrated that this is precisely the infinity that is needed. However, $\delta(x)$ can be obtained as the result of passage to the limit ($n \to \infty$) from well-behaved (well-defined) functions of $x$ that involve the auxiliary quantity $n$ as a parameter. We must stress particularly here that $\delta(x)$ may be obtained by such a limit process from different functions $\varphi$. As long as $n$ is finite, the functions $n\varphi(nx)$ differ from one another and, in particular,

$$I = \int f(x)\,n\varphi(nx)\,dx \neq f(0). \tag{16.4.3}$$

And only in the limit, as $n \to \infty$, do all the distinct functions $n\varphi(nx)$ tend to a single limit $\delta(x)$ and the corresponding integrals^{16.10} (16.4.3) tend to $f(0)$:

$$\lim_{n\to\infty} I = f(0). \tag{16.4.4}$$

The arbitrariness that is evident in the choice of the original function $\varphi(x)$ from which we obtain $\delta(x)$ is in full accord with the essence of the matter. In Section 17.4, we will consider examples of the application of $\delta(x)$ to physics. The description of some kind of action, that is to say, some kind of finite function $\psi(x)$, with the aid of the delta function is possible and desirable precisely when the detailed form of the action (which is to say, the true dependence of it on $x$) is inessential, the important thing being only the integral.

16.10 Strictly speaking, if $f(x)$ is discontinuous at certain values of $x$ and tends to infinity at these points (or $f(x) \to \infty$ as $x \to \infty$ or as $x \to -\infty$), not all $\varphi(x)$ can be used to obtain (16.4.4). Furthermore, the function $f(x)$ must not be discontinuous or at least its discontinuities must not fall on the point $x = 0$ where $\delta(x) = \infty$, since otherwise we will have those meaningless questions about the value of a function at a point of discontinuity.

The examples given above do not exhaust by any means the diverse $\varphi(x)$ from which we can “manufacture” $\delta(x)$. We can give up the symmetry of $\varphi(x)$: as we pass to $n\varphi(nx)$ and increase $n$, the distance of the maximum from $x = 0$ diminishes, that is, even an asymmetric function approaches $\delta(x)$. Here is an example: the function $\pi^{-1/2} e^{-(x-1)^2}$ passes into the function $n\pi^{-1/2} e^{-(nx-1)^2}$, whose maximum lies at $x = 1/n$.

We can also give up specifying $\varphi(x)$ by a single simple formula that ensures the smoothness of $\varphi(x)$. For instance, we can take the function $\varphi(x)$ to be discontinuous, say,

$$\varphi(x) = \begin{cases} 1/2 & \text{for } -1 < x < 1, \\ 0 & \text{for } x < -1 \text{ and } x > 1. \end{cases} \tag{16.4.5}$$

The limit process consists in our taking

$$\varphi_n(x) = \begin{cases} n/2 & \text{for } -1 < nx < 1, \text{ i.e., } -1/n < x < 1/n, \\ 0 & \text{for } x < -1/n \text{ and } x > 1/n \end{cases} \tag{16.4.6}$$

and sending $n$ to infinity. (Sketch the graph of $\varphi(x)$ according to (16.4.5) and also $\varphi_n(x)$ according to (16.4.6) for $n = 3$ and $n = 10$.)

Finally, we can reject the condition that 
cp (x) be positive. A curious and important 
example is 

+ 0 ) 

- 0 ) 

(16.4.7) 

The graph of R (x) for a given co is shown in 
Figure 16.4.2. The value of R (0) is equal to 
(o/ji (the indeterminate form involved in the 
vanishing, simultaneously, of the numerator 
and the denominator at x = 0 is evaluated 
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in elementary fashion). The function R (#) 
passes through zero and changes sign at x — 
±ji/a), ±2 ji/(d, dz3ic/o), .... The oscillations 
of R (x) damp out as they recede from x = 0 
due to the denominator. The curve does not 
go beyond the curves y = ± 1/ji | x | shown 
dashed in Figure 16.4.2. It can be verified that 

+ oo 

R (x) dx — 1 for all values of co. 

— oo 

It turns out that we can regard $R(x)$ as a delta function if we send $\omega$ to infinity. This is plausible since, as $\omega$ is increased, the altitude $\omega/\pi$ of the principal maximum on the $R$ axis grows, while the width of the half-wave $-\pi/\omega < x < \pi/\omega$ decreases. But how are we to deal with the fact that the amplitude of the oscillations does not decrease when $\omega$ increases: as before, $R$ attains $\pm 1/(\pi |x|)$ and the dashed lines do not move toward each other? Let us consider the integral $\int f(x) R(x)\,dx$. The greater the value of $\omega$, the more frequent the oscillations and the more exactly the positive and negative half-waves compensate each other, yet the contribution of the first half-wave and of the ones closest to it remains the same all the time. It is for this reason (we do not give the proof) that $\lim_{\omega\to\infty} \int f(x) R(x)\,dx = f(0)$, but this means that $\lim_{\omega\to\infty} R(x)$ has the properties of the delta function.
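The statement made here without proof can at least be checked numerically. The sketch below is an illustration under stated assumptions: it takes $R(x) = \sin \omega x/(\pi x)$ (the form consistent with $R(0) = \omega/\pi$ and the zeros at $\pm\pi/\omega, \ldots$), and it uses the test function $f(x) = e^{-x^2}$, for which the integral is known in closed form to be $\operatorname{erf}(\omega/2)$. The names `R` and `smeared` and the cutoff `L` are hypothetical.

```python
import math

def R(x, omega):
    """Assumed form of (16.4.7): sin(omega*x) / (pi*x), with R(0) = omega/pi."""
    if x == 0.0:
        return omega / math.pi
    return math.sin(omega * x) / (math.pi * x)

def smeared(f, omega, L=8.0, steps=40000):
    """Midpoint-rule estimate of the integral of f(x) * R(x) over (-L, L)."""
    h = 2 * L / steps
    total = 0.0
    for i in range(steps):
        x = -L + (i + 0.5) * h
        total += f(x) * R(x, omega)
    return total * h

def bell(x):
    """Test function f(x) = exp(-x^2), with f(0) = 1."""
    return math.exp(-x * x)

if __name__ == "__main__":
    for omega in (1.0, 3.0, 10.0, 20.0):
        # The values should approach f(0) = 1 as omega grows.
        print(omega, smeared(bell, omega), math.erf(omega / 2))
```

The cutoff at $|x| = 8$ is harmless here only because $e^{-x^2}$ is negligible beyond it; for a slowly decaying $f$ a larger interval would be needed.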

Of interest is a similar function:

$$P(x) = \frac{1}{2\pi}\Bigl(1 + 2\sum_{k=1}^{q} \cos kx\Bigr). \qquad (16.4.8)$$

Using the formulas of elementary trigonometry, we can obtain the expression

$$P(x) = \frac{\sin\bigl[(q + \tfrac{1}{2})x\bigr]}{2\pi \sin (x/2)} \qquad (16.4.8a)$$

(see Exercise 16.4.1). The graph of $P(x)$ for $q = 10$ is shown in Figure 16.4.3. As $q \to \infty$, the function $P(x)$ behaves near $x = 0$ just like $R(x)$ does as $\omega \to \infty$, but differs from $R(x)$ in that its high maxima repeat periodically at $x = 0, \pm 2\pi, \pm 4\pi, \ldots$. In other words, $P(x)$ represents, as $q \to \infty$, a sum of delta functions:

$$P(x) = \delta(x) + \delta(x - 2\pi) + \delta(x + 2\pi) + \delta(x - 4\pi) + \delta(x + 4\pi) + \ldots.$$

Figure 16.4.3
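As a quick plausibility check, the finite sum (16.4.8) can be compared with a closed form. The sketch below assumes that the closed form is the standard trigonometric identity $\sin[(q + 1/2)x] / (2\pi \sin(x/2))$; the function names `P_sum` and `P_closed` are hypothetical.

```python
import math

def P_sum(x, q):
    """P(x) as the finite cosine sum (16.4.8)."""
    return (1 + 2 * sum(math.cos(k * x) for k in range(1, q + 1))) / (2 * math.pi)

def P_closed(x, q):
    """Assumed closed form: sin((q + 1/2) x) / (2 pi sin(x/2)); undefined at x = 0, +-2pi, ..."""
    return math.sin((q + 0.5) * x) / (2 * math.pi * math.sin(x / 2))

if __name__ == "__main__":
    q = 10
    for x in (0.3, 1.7, -2.5):
        print(x, P_sum(x, q), P_closed(x, q))
    # The high maxima repeat with period 2*pi; their height is (2q + 1)/(2*pi).
    print(P_sum(0.0, q), P_sum(2 * math.pi, q), (2 * q + 1) / (2 * math.pi))
```

The last line illustrates the periodic repetition of the principal maximum, whose height $(2q+1)/2\pi$ grows without bound as $q \to \infty$.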

The functions $R$ and $P$ and their connection with the delta function are not mathematical oddities. Recall how $R$ and $P$ were constructed: $R$ is an integral of the cosine and $P$ is a sum of cosines. If it is possible to construct the delta function from cosines by addition (integration is a kind of addition), then $\delta(x - a)$ can be constructed from $\cos \omega(x - a) = \cos \omega x \cos \omega a + \sin \omega x \sin \omega a$, that is, from cosines and sines with constant coefficients. But then any function $f(x)$ can be represented as a sum of cosines and sines. Any function can be replaced by a series of steps $f(x_i)\,\Delta x_i$, and each step is actually $\delta(x - x_i) f(x_i)\,\Delta x_i$. Thus, with the aid of $R$ and $P$, that is, essentially via delta functions, the possibility is proved of expanding functions in a Fourier series (if the function is periodic; cf. Section 10.9) and into a Fourier integral (if the function is nonperiodic).

Reread Sections 10.8 and 10.9 after fin- 
ishing with this section. Ordinarily, mathe- 
matics textbooks do not mention the delta 
function. Many mathematicians prefer to 
keep such “physical heresy” away from the 
student as long as possible. The realization 
that actually the delta function is being used 
in the proofs will help you to grasp the mean- 
ing of these proofs. 

Exercise

16.4.1. Prove that the right-hand sides of (16.4.8) and of (16.4.8a) coincide.

Chapter 17 Applying Functions of a Complex Variable 
and the Delta Function 


17.1 Complex Numbers and Mechanical 
Oscillations 

The third part of the book is intended only to excite the reader's curiosity about functions that seem to be unlike ordinary functions, such as the functions of a complex variable and the delta function, and to show the possibility, in principle, of extending the techniques of higher mathematics to these unusual objects as well. This last chapter of the book outlines some possible ways of applying the functions described in Chapters 14, 15, and 16 to some problems of physics and engineering.

We will show how calculations can be simplified by using complex variables, taking as an example the forced oscillations of a heavy body (see Chapter 10, especially Section 10.5). We start with the equation of such oscillations. Suppose a body of mass $m$ (a material point of mass $m$, since we neglect the dimensions of the body) is acted on by an elastic force $-kx$ which is proportional to the deviation $x$ (to make it clearer, imagine a spring trying to return the body to equilibrium), a friction force $-h\,(dx/dt)$ which is proportional to the velocity $dx/dt$ (and directed in opposition to the velocity), and an external force $F = F(t)$. Here equation (9.4.2) for the body's motion (Newton's second law, see Section 9.4) assumes the form

$$m\frac{d^2x}{dt^2} = -kx - h\frac{dx}{dt} + F,$$

or

$$m\frac{d^2x}{dt^2} + h\frac{dx}{dt} + kx = F. \qquad (17.1.1)$$

We confine ourselves mainly to the case of a periodic external force $F = f\cos \omega t$ (compare this with what is said below at the end of this section). For a periodic force (with frequency $\omega$ and amplitude $f$) equation (17.1.1) takes the form

$$m\frac{d^2x}{dt^2} + h\frac{dx}{dt} + kx = f\cos \omega t \qquad (17.1.2)$$

(see (10.5.1)); here $m$, $k$, $h$, $f$, and $\omega$ are constants.^{17.1}

We have considered all the quantities in (17.1.1) and (17.1.2) to be real-valued. Let us allow for the time being (since in the final solution we have to eliminate this assumption again) complex-valued solutions $x$ to our equation, and let all the remaining quantities, which are the coefficients in (17.1.2), be real-valued as before. It is more convenient now to replace the expression $f\cos \omega t$ for the external force with

$$fe^{i\omega t} \quad (= f\cos \omega t + if\sin \omega t), \qquad (17.1.3)$$

adding to the right-hand side of (17.1.2) the imaginary term $if\sin \omega t$.

We seek the solution to the equation

$$m\frac{d^2x}{dt^2} + h\frac{dx}{dt} + kx = fe^{i\omega t} \qquad (17.1.2a)$$

in the form

$$x = ae^{i\omega t}. \qquad (17.1.4)$$

Such a form of the solution is suggested by the fact that all the derivatives of an exponential function are, as we know, proportional to that function; therefore after substituting this expression for $x$ in the left- and right-hand sides of (17.1.2a) both sides will contain the same factor $e^{i\omega t}$, which we can simply cancel. Indeed, since by (17.1.4) we have $dx/dt = ia\omega e^{i\omega t}$ and $d^2x/dt^2 = -a\omega^2 e^{i\omega t}$, equation (17.1.2a) yields

$$-am\omega^2 e^{i\omega t} + iha\omega e^{i\omega t} + ake^{i\omega t} = fe^{i\omega t},$$

or

$$a\,(-m\omega^2 + k + ih\omega) = f,$$

that is,

$$a = \frac{f}{-m\omega^2 + k + ih\omega}. \qquad (17.1.5)$$

^{17.1} Since $m$ has the dimensions of kg and $x$, $dx/dt$, and $d^2x/dt^2$ the dimensions of m, m/s, and m/s$^2$ respectively ($\cos \omega t$ is of course dimensionless), it is clear that the coefficients $k$, $h$, and $f$ in equation (17.1.2) have the dimensions of kg/s$^2$, kg/s, and kg·m/s$^2$ (i.e. N, since $f$ is a force).

Thus we have found that the (complex) amplitude $a$ of the oscillations $x = ae^{i\omega t}$ $(= a(\cos \omega t + i\sin \omega t))$ is proportional to the applied force $f$, which is quite natural for a linear system (for the solution of the linear equation (17.1.2a)). When the friction force is not large (i.e. when $h$ is not large), the maximum of the amplitude (of the amplitude's modulus, to be more exact) is attained at a frequency $\omega$ such that $m\omega^2 \approx k$, that is, at a frequency of the external force close to the natural frequency $\omega_0 = \sqrt{k/m}$ of the system (cf. Section 10.2); thus, the phenomenon of resonance occurs here (Section 10.5).

Expression (17.1.5) for the complex-valued amplitude $a$ can be rewritten as

$$a = fre^{i\varphi}, \qquad (17.1.5a)$$

where

$$re^{i\varphi} = \frac{1}{-m\omega^2 + k + ih\omega},$$

that is,

$$r = \frac{1}{\sqrt{(-m\omega^2 + k)^2 + h^2\omega^2}}, \qquad \varphi = \arctan\Bigl(\frac{h\omega}{m\omega^2 - k}\Bigr). \qquad (17.1.5b)$$

Then,

$$x = ae^{i\omega t} = fre^{i(\omega t + \varphi)}, \qquad (17.1.6)$$

where $r$ and $\varphi$ are given by formula (17.1.5b). Thus, the concise complex notation (17.1.4) of the real solution to equation (17.1.2) gives both the amplitude $|x| = fr$ and the frequency $\omega$ and phase $\varphi$ of the solution, that is, the phase shift of the oscillation with respect to the phase of the force, which is assumed to be zero (see (17.1.5b)).
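The recipe just described (divide the force amplitude by $-m\omega^2 + k + ih\omega$, then read off modulus and phase) is easy to try out. The following is a minimal sketch under stated assumptions: the function names `steady_state` and `ode_residual`, the sample parameter values, and the finite-difference check are illustrative choices, not part of the text. The phase is taken with `cmath.phase`, which picks the branch in $(-\pi, \pi]$.

```python
import cmath
import math

def steady_state(m, h, k, f, omega):
    """Return (r, phi) of (17.1.5a) from the complex amplitude (17.1.5); assumes f > 0."""
    a = f / (-m * omega**2 + k + 1j * h * omega)
    return abs(a) / f, cmath.phase(a)

def ode_residual(t, m, h, k, f, omega, dt=1e-4):
    """m x'' + h x' + k x - f cos(omega t) for x = f r cos(omega t + phi),
    with the derivatives estimated by central differences."""
    r, phi = steady_state(m, h, k, f, omega)
    x = lambda s: f * r * math.cos(omega * s + phi)
    d1 = (x(t + dt) - x(t - dt)) / (2 * dt)
    d2 = (x(t + dt) - 2 * x(t) + x(t - dt)) / dt**2
    return m * d2 + h * d1 + k * x(t) - f * math.cos(omega * t)

if __name__ == "__main__":
    params = dict(m=1.0, h=0.5, k=4.0, f=2.0, omega=1.5)   # sample values only
    print(steady_state(**params))          # r > 0, phi < 0
    print(ode_residual(0.7, **params))     # close to zero
```

The residual is not exactly zero only because of the finite-difference approximation of the derivatives.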


Let us now return to real variables. Since the initial equation (17.1.1) is linear, the principle of superposition is valid for the mechanical system which the equation describes, that is, the motion caused by a composite force $F_1 + F_2$ can be obtained by adding the motions produced separately by the forces $F_1$ and $F_2$ (we have frequently used this principle before). In our case the force (17.1.3) is composed of the real component $f\cos \omega t$ and the imaginary component $if\sin \omega t$. But all the coefficients in (17.1.1), except for the free term $F$ (the term which does not contain the unknown function $x$), are real; therefore, to a real force $F$ there must correspond only a real value of $x$ and to a (pure) imaginary force $F$ an imaginary $x$. In other words, our linear system generates a real response to a real external force, and an imaginary response to an imaginary force. Therefore we can be sure that to the external force $F = f\cos \omega t$ there corresponds the solution

$$x = fr\cos(\omega t + \varphi) \qquad (17.1.7)$$

of equation (17.1.2), where $r$ and $\varphi$ are given by (17.1.5b), and to the force $F = if\sin \omega t$ there corresponds the solution $x = ifr\sin(\omega t + \varphi)$ (i.e. to the force $F = f\sin \omega t$ of equation (17.1.1) there corresponds the solution $x = fr\sin(\omega t + \varphi)$, which, by the way, supplies us with no new information, since this solution can be obtained from (17.1.7) by substituting $t - \pi/2\omega$ for $t$).

The same result can be obtained somewhat differently. By Euler's formulas (14.3.4) the real force $F = f\cos \omega t$ is the sum of the (complex-conjugate) forces $F_1 = (1/2) fe^{i\omega t}$ and $F_2 = (1/2) fe^{-i\omega t}$; here, due to the fact that equation (17.1.1) is linear (by the superposition principle), the motion brought about by the sum $F_1 + F_2$ of the forces is the sum of the motions produced by each of them separately. But we have already seen that the force $F_1$ generates the motion $x = (1/2) fre^{i(\omega t + \varphi)}$, where $r$ and $\varphi$ are given by (17.1.5b). Similarly, we can find the solution corresponding to the force $F_2$ and then the sum of the two solutions (see Exercise 17.1.1).

Note finally some properties of the solution obtained which are retained even in more general cases where the oscillatory system at hand consists of a number of masses, springs ("returning forces"), and friction surfaces (a number of friction forces).

1. The imaginary part of the "complex amplitude" (17.1.5a) is negative and equals $-f\omega h r^2$. (Why?) This corresponds to a negative angle $\varphi$ (or to an angle $\varphi$ within the limits $\pi < \varphi < 2\pi$).

The negative sign of the imaginary part of $a$ is connected with the work performed by the force $F$ on the average during many oscillations, or exactly in one cycle. If an (external) force is given by the equation $F = f\cos \omega t$ and the solution has the form (17.1.7), then the work of the force is

$$\int F\,dx = \int F\frac{dx}{dt}\,dt = \int f\cos(\omega t)\,[-\omega fr\sin(\omega t + \varphi)]\,dt$$
$$= -f^2 r\omega \int \sin(\omega t + \varphi)\cos(\omega t)\,dt = -\tfrac{1}{2} f^2 r\omega \int [\sin(2\omega t + \varphi) + \sin \varphi]\,dt$$
$$= f^2 r\omega \Bigl[-\tfrac{1}{2}\int \sin \varphi\,dt - \tfrac{1}{2}\int \sin(2\omega t + \varphi)\,dt\Bigr]. \qquad (17.1.8)$$

Clearly, the second integral on the right-hand side of (17.1.8), extended over the time $t$ which corresponds to one cycle of oscillations ($t = T = 2\pi/\omega$), vanishes, so that the work is equal to $-(1/2) f^2 r\omega (\sin \varphi)\,T$. But this work must be positive, since during the considered time the system returns to the initial position corresponding to the same values of both the kinetic and the potential energy; therefore the work of the force $F$ is not expended on the displacement of the system but is spent to compensate the work of the friction force, which is always negative: the friction force is opposite to the velocity $dx/dt$ and hence to the displacement $dx$. Hence $\sin \varphi$ must be negative, in agreement with the negative imaginary part of $a$.

Note that during a time differing from the full cycle of oscillations, for instance, during a short time which is much less than $T$, the work of the force $F$ can be both positive and negative, since during this short period the energy of the body (the sum of its potential and kinetic energies) can decrease. We can be sure of the sign of the work only if we consider a very long period of time of functioning of the system (or the time of one cycle: a large period of time is made up of many similar periods (cycles)).
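The energy balance over one cycle can be verified numerically. The sketch below is an illustration only: the function name `cycle_energy` and the sample parameters are assumptions. It integrates the power of the external force and the power dissipated by friction, $h\,(dx/dt)^2$, over one period and compares both with the expression $-(1/2) f^2 r\omega (\sin \varphi)\,T$.

```python
import cmath
import math

def cycle_energy(m, h, k, f, omega, steps=2000):
    """Return (work of F, heat dissipated by friction, closed-form value) for one cycle."""
    a = f / (-m * omega**2 + k + 1j * h * omega)      # complex amplitude (17.1.5)
    r, phi = abs(a) / f, cmath.phase(a)
    T = 2 * math.pi / omega
    dt = T / steps
    work = heat = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        v = -omega * f * r * math.sin(omega * t + phi)   # velocity dx/dt from (17.1.7)
        work += f * math.cos(omega * t) * v * dt         # F dx = F (dx/dt) dt
        heat += h * v * v * dt                           # work done against friction
    formula = -0.5 * f * f * r * omega * math.sin(phi) * T
    return work, heat, formula

if __name__ == "__main__":
    print(cycle_energy(1.0, 0.5, 4.0, 2.0, 1.5))   # three nearly equal positive numbers
```

That the three numbers agree reflects the identity $\sin \varphi = -h\omega r$, which follows from (17.1.5) and (17.1.5a).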

2. The quantity $a$ (see (17.1.5)-(17.1.5b)) can be considered as a function of a complex-valued frequency $\omega$. Here $a$ becomes infinite for $\omega = \omega_r$, where $\omega_r$ is the resonance frequency and equals $\lambda + i\mu$ with the imaginary part necessarily positive, i.e. $\mu > 0$. That $a$ is infinite or, to be more exact, that the ratio $a/f$ is infinite, means that $f/a = 0$, i.e. that at the frequency $\omega_r$ free oscillations may set in without the action of any external force $F$. But in this case the solution $x(t) = ae^{i\omega_r t}$ satisfies the equation

$$m\frac{d^2x}{dt^2} + h\frac{dx}{dt} + kx = 0, \qquad (17.1.2b)$$

which is obtained from (17.1.1) by substituting $F = 0$, whence it follows that

$$-m\omega_r^2 + ih\omega_r + k = 0 \qquad (17.1.9)$$

(cf. the denominator in (17.1.5) for $a$). Clearly, such free oscillations (when friction is present) can only be damped. Therefore in the solution $e^{i\omega_r t} = e^{i\lambda t} e^{-\mu t}$ the quantity $-\mu$ must be negative, i.e. $\mu$ must be positive.

In our case the quadratic equation (17.1.9) yields

$$\omega_r = \frac{ih}{2m} \pm \sqrt{\frac{k}{m} - \frac{h^2}{4m^2}}.$$

If here $k/m - h^2/4m^2 = \kappa^2 > 0$, then we have two solutions $\omega_{r1} = \kappa + ih/2m$ and $\omega_{r2} = -\kappa + ih/2m$ with the same (positive) imaginary part $h/2m$. (These solutions are almost complex-conjugate; more exactly, $\omega_{r1} = -\omega_{r2}^*$ and $\omega_{r2} = -\omega_{r1}^*$.) If $k/m - h^2/4m^2 < 0$, i.e. $h^2/4m^2 - k/m = \kappa_1^2 > 0$, then the solutions have the form $\omega_{r1} = (h/2m + \kappa_1)\,i$ and $\omega_{r2} = (h/2m - \kappa_1)\,i$, which are two pure imaginary numbers having positive coefficients of $i$, since $\kappa_1 < h/2m$.
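Both cases can be checked with a few lines of code. The sketch below is only an illustration: the function name `free_frequencies` and the sample values of $m$, $h$, $k$ are assumptions. It solves (17.1.9) with `cmath.sqrt`, which returns a pure imaginary root when the expression under the radical is negative.

```python
import cmath

def free_frequencies(m, h, k):
    """Both roots of -m w^2 + i h w + k = 0, i.e. w = i h/(2m) +- sqrt(k/m - h^2/(4 m^2))."""
    root = cmath.sqrt(k / m - h * h / (4 * m * m))
    return 1j * h / (2 * m) + root, 1j * h / (2 * m) - root

if __name__ == "__main__":
    # Weak friction: two roots with the same positive imaginary part h/2m.
    print(free_frequencies(1.0, 0.5, 4.0))
    # Strong friction: two pure imaginary roots with positive coefficients of i.
    print(free_frequencies(1.0, 5.0, 4.0))
```

In both regimes the imaginary parts come out positive, which is the statement that free oscillations with friction can only be damped.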

The same techniques based on using a complex variable $x$ can also be applied in the more general case of an arbitrary, and not only sinusoidal, force $F$. In the case of an arbitrary periodic force $F$ we can just expand the force in a Fourier series (see Section 10.9) and then use the superposition principle, considering separately the effect produced by each sinusoidal force (17.1.3); here it is convenient to use the complex form of a Fourier series, that is, the representation of an arbitrary periodic function $F(t)$ with period $T$ as a linear combination of the functions $e^{in\omega t}$, where $\omega = 2\pi/T$ (the fundamental frequency; cf. (10.2.5)) and $n = \ldots, -2, -1, 0, 1, 2, \ldots$ is an integer, that is, in the form of the sum of the functions $e^{in\omega t}$ taken with some coefficients $f_n$. In the case of a nonperiodic force $F$ it must be expanded not in a Fourier series but in the so-called Fourier integral extended over all possible values of the frequency $\omega$ (and not over the frequencies $\omega_n = n\omega$ that are multiples of the fundamental frequency $\omega$, as in the case of the Fourier series); here the most convenient form of the Fourier integral is also its complex form.

Without dwelling on the details of the corresponding constructions (see, however, Exercise 17.1.2), we note that another approach is possible to the problem of oscillations generated by an arbitrary force $F = F(t)$: decomposing the force $F$ into separate delta-like components, each of which corresponds to a narrow strip of the graph of the function $F$, that is, replacing this function with the sum of functions $F_\tau(t)$, where

$$F_\tau(t) = \begin{cases} F(t) & \text{for } \tau \le t \le \tau + d\tau, \\ 0 & \text{for all other } t, \end{cases} \qquad (17.1.10)$$

and then using the principle of superposition. (The solution corresponding to the delta-like force $F_\tau(t)$ is termed the Green function of the respective differential equation; in this connection, see Section 17.4.) Here we arrive at the solution of equation (17.1.1) having, in the simpler case when friction forces are absent (when $h = 0$), the following form:

$$x(t) = A\int_{-\infty}^{t} e^{i\omega(t - \tau)} F(\tau)\,d\tau, \qquad (17.1.11)$$

where $A$ and $\omega$ are defined by the coefficients $m$ and $k$ of equation (17.1.1), or in the "real" form

$$x(t) = A\int_{-\infty}^{t} \sin[\omega(t - \tau)]\,F(\tau)\,d\tau, \qquad (17.1.11a)$$

where $\omega = \sqrt{k/m}$ (cf. Section 10.2) and $A = 1/\sqrt{km}$.

For the derivation of formula (17.1.11) or (17.1.11a) see Section 17.4; in particular, compare (17.1.11a) with (17.4.5a). Without considering the reasoning used in obtaining formula (17.1.11a), it is also easy to verify directly that it gives the correct solution to equation (17.1.1) for the case $h = 0$. Indeed, the function on the right-hand side of (17.1.11a) depends on $t$ in two ways: first, $t$ is the upper limit of the integral and, second, the integrand depends on $t$ (in the expression $\sin[\omega(t - \tau)]$). Hence the derivative $dx/dt$ must involve two terms: the first term is the result of differentiation of the integral with respect to the upper limit, and the second term is obtained in differentiating the integrand with respect to $t$.^{17.2} But the derivative of the integral with respect to the upper limit is equal, by the Newton-Leibniz theorem, to the value of the integrand at the upper limit (see Section 3.3); thus we have

$$\frac{dx}{dt} = A\sin 0 \cdot F(t) + A\int_{-\infty}^{t} \omega\cos[\omega(t - \tau)]\,F(\tau)\,d\tau = A\omega\int_{-\infty}^{t} \cos[\omega(t - \tau)]\,F(\tau)\,d\tau \qquad (17.1.12)$$

and consequently

$$\frac{d^2x}{dt^2} = A\omega\cos 0 \cdot F(t) + A\omega\int_{-\infty}^{t} (-\omega)\sin[\omega(t - \tau)]\,F(\tau)\,d\tau$$
$$= A\omega F(t) - A\omega^2\int_{-\infty}^{t} \sin[\omega(t - \tau)]\,F(\tau)\,d\tau = A\omega F(t) - \omega^2 x(t). \qquad (17.1.13)$$

Therefore, if $\omega^2 = k/m$, i.e. $\omega = \sqrt{k/m}$, and $A\omega = 1/m$, i.e. $A = 1/\omega m = 1/\sqrt{km}$, then

$$\frac{d^2x}{dt^2} = -\frac{k}{m}x + \frac{1}{m}F(t),$$

which is equation (17.1.1) at $h = 0$.

^{17.2} It follows from the fact that if $y = F(x) = \int_a^x f(x, \tau)\,d\tau$, then

$$\Delta y = F(x + \Delta x) - F(x) = \int_a^{x + \Delta x} f(x + \Delta x, \tau)\,d\tau - \int_a^x f(x, \tau)\,d\tau$$
$$= \Bigl[\int_a^{x + \Delta x} f(x + \Delta x, \tau)\,d\tau - \int_a^x f(x + \Delta x, \tau)\,d\tau\Bigr] + \Bigl[\int_a^x f(x + \Delta x, \tau)\,d\tau - \int_a^x f(x, \tau)\,d\tau\Bigr].$$

Thus, the ratio $\Delta y/\Delta x$ decomposes into the sum of two fractions. The limit of the first one is the derivative of the integral $\int_a^x f(x + \Delta x, \tau)\,d\tau$ with respect to the upper limit and equals the function $f(x + \Delta x, \tau)$ at $\tau = x$ (which transforms to $f(x, x)$ as $\Delta x \to 0$).

Finally we note that the same methods of "complex analysis" can be applied to more complicated problems, say, to those related to the establishment of oscillations, where $x = 0$ at $t < t_0$ and $x \ne 0$ only at $t > t_0$, or to the case of several oscillating bodies when a number of "returning forces" ("springs") act in the system and there are many friction surfaces (many forces of friction are present which are proportional to the derivatives $dx/dt$). In the theory of electric circuits (see Chapter 13) we also frequently encounter oscillatory processes; in equations related to equation (17.1.1) the role of the coefficients $m$, $h$, and $k$ is played here by the inductance $L$, the resistance $R$, and the quantity $1/C$, the reciprocal of the capacitance $C$, and the role of the "external force" $F$ is played by the voltage supplied to the circuit (emf). Here as well, methods similar to those outlined above (consideration of complex solutions of real differential equations) are used rather widely, e.g. the concept of "imaginary current" is familiar to all those acquainted with electrical engineering.

We omit the discussion of all these questions.
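The frictionless formula (17.1.11a) lends itself to a simple numerical experiment. The sketch below is an illustration under stated assumptions: the function name `green_response` is hypothetical, the force is assumed to vanish before a starting time `t_start` (so that the lower limit $-\infty$ can be replaced), and the test force is a constant $F_0$ switched on at $t = 0$, for which an oscillator initially at rest is known to move as $x(t) = (F_0/k)(1 - \cos \omega t)$.

```python
import math

def green_response(F, t, m, k, t_start, steps=20000):
    """x(t) = A * integral from t_start to t of sin[omega (t - tau)] F(tau) dtau,
    with omega = sqrt(k/m) and A = 1/sqrt(k m); assumes F(tau) = 0 for tau < t_start."""
    omega = math.sqrt(k / m)
    A = 1.0 / math.sqrt(k * m)
    step = (t - t_start) / steps
    total = 0.0
    for i in range(steps):
        tau = t_start + (i + 0.5) * step          # midpoint rule
        total += math.sin(omega * (t - tau)) * F(tau)
    return A * total * step

if __name__ == "__main__":
    m, k, F0 = 2.0, 8.0, 3.0                      # sample values only
    omega = math.sqrt(k / m)
    for t in (0.5, 1.3, 4.0):
        exact = F0 / k * (1 - math.cos(omega * t))
        print(t, green_response(lambda tau: F0, t, m, k, 0.0), exact)
```

Any other force that is zero before `t_start` can be passed in place of the constant one; only the comparison value is specific to the step force.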


Exercises

17.1.1. Find the solution (17.1.7) to equation (17.1.2) as the sum of two solutions to equation (17.1.1) corresponding to the values $F_1 = (1/2) fe^{i\omega t}$ and $F_2 = (1/2) fe^{-i\omega t}$ of the force $F$.

17.1.2. Prove that a (real) function $f(x)$ defined in the interval from $-\pi$ to $\pi$ can be represented as a sum (the complex form of the Fourier series):

$$f(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx} \quad (= \ldots + c_{-2}e^{-2ix} + c_{-1}e^{-ix} + c_0 + c_1 e^{ix} + c_2 e^{2ix} + \ldots),$$

with $c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e^{-ikt}\,dt$ (here $k = \ldots, -2, -1, 0, 1, 2, \ldots$) and $c_{-k} = c_k^*$; if the values of $f(x)$ are complex numbers (while $x$ is still real), then the same formulas are still valid, with the exception that $c_{-k}$ is not necessarily equal to $c_k^*$.


17.2 Integrals in the Complex Plane 

We consider the definite integral

$$w(z) = \int_a^z f(z)\,dz, \qquad (17.2.1)$$

where the lower limit is a complex number $a$, while the upper limit $z$ is variable. The integral here is understood in the same way as when we dealt with real variables (see Chapter 3), that is, the right-hand side of (17.2.1) is approximately equal to the sum

$$f(z_0)\,\Delta z_0 + f(z_1)\,\Delta z_1 + \ldots + f(z_{n-1})\,\Delta z_{n-1} = \sum_{i=0}^{n-1} f(z_i)\,\Delta z_i,$$

where $z_0 = a, z_1, z_2, \ldots, z_{n-1}, z_n = z$ are the points which divide the path $C$ of integration (connecting $a$ and $z$) into rather small portions, and $\Delta z_i = z_{i+1} - z_i$, $i = 0, 1, \ldots, n - 1$.

Figure 17.2.1    Figure 17.2.2

(Footnote 17.2, concluded.) The second fraction is equal to

$$\frac{1}{\Delta x}\int_a^x [f(x + \Delta x, \tau) - f(x, \tau)]\,d\tau = \int_a^x \frac{f(x + \Delta x, \tau) - f(x, \tau)}{\Delta x}\,d\tau,$$

and under ordinary conditions its limit is $\int_a^x \frac{\partial f(x, \tau)}{\partial x}\,d\tau$.

We still have to specify the integration path $C$. Figure 17.2.1 shows three such paths as an example. We can prove, however, that if within the region bounded by two paths $C$ and $C_1$ the function has no singularities, that is, it is defined everywhere and nowhere equals infinity, then integral (17.2.1) taken over these paths has one and the same value. Of course, when we mention the derivative $f'(z)$ of the function $f(z)$ of a complex variable we mean that the function is considered analytic.

The property of integrals (17.2.1) given above admits of another formulation equivalent to the original one. Let us denote by $\oint f(z)\,dz$ the integral extended over a closed contour, say, the contour which is obtained if we go from point $a$ to point $z$ along path $C$ and then return from point $z$ to point $a$ along path $-C_1$ (path $C_1$ joins $a$ and $z$, but we go along the same path in the opposite direction; see Figure 17.2.1). Since (cf. Section 3.2) $\int_{-C_1} f(z)\,dz = -\int_{C_1} f(z)\,dz$, we have

$$\oint f(z)\,dz = \int_C f(z)\,dz + \int_{-C_1} f(z)\,dz = \int_C f(z)\,dz - \int_{C_1} f(z)\,dz,$$

and if $\int_{C_1} f(z)\,dz = \int_C f(z)\,dz$, then $\oint f(z)\,dz = 0$. Thus the integral $\oint f(z)\,dz$ of an analytic function over a closed contour is always zero if inside the contour the function has no singularities. Conversely, if the integral is equal to zero over any closed contour, it is independent of the path of integration.

In order to prove the above statement, let us first consider a small square with vertices 1 $(x_0, y_0)$, 2 $(x_0 + \Delta, y_0)$, 3 $(x_0 + \Delta, y_0 + \Delta)$, 4 $(x_0, y_0 + \Delta)$ (Figure 17.2.2). We denote by $\oint f(z)\,dz$ the integral over the closed contour joining consecutively points 1, 2, 3, 4, and again 1; we evaluate this integral.

Let $f(z) = f(x + iy) = u(x, y) + iv(x, y)$; we take $\Delta$ so small that we can assume that

$$\oint f(z)\,dz = \int_1^2 f(z)\,dz + \int_2^3 f(z)\,dz + \int_3^4 f(z)\,dz + \int_4^1 f(z)\,dz \approx f(z_1)\,\Delta z_1 + f(z_2)\,\Delta z_2 + f(z_3)\,\Delta z_3 + f(z_4)\,\Delta z_4,$$

where $z_j$ is understood as point $j$ and $\Delta z_j = z_{j+1} - z_j$; here $j = 1, 2, 3, 4$ and $z_{4+1} = z_1$ is again point 1. Since $\Delta z_1 = z_2 - z_1 = \Delta = -\Delta z_3 = z_3 - z_4$ and $\Delta z_2 = z_3 - z_2 = i\Delta = -\Delta z_4 = z_4 - z_1$, we have

$$\oint f(z)\,dz \approx \{[u(x_0, y_0) + iv(x_0, y_0)] - [u(x_0 + \Delta, y_0 + \Delta) + iv(x_0 + \Delta, y_0 + \Delta)]\}\,\Delta$$
$$+ \{[u(x_0 + \Delta, y_0) + iv(x_0 + \Delta, y_0)] - [u(x_0, y_0 + \Delta) + iv(x_0, y_0 + \Delta)]\}\,i\Delta$$
$$= -\{[u(x_0 + \Delta, y_0 + \Delta) - u(x_0, y_0)] + [v(x_0 + \Delta, y_0) - v(x_0, y_0 + \Delta)]\}\,\Delta$$
$$- \{[v(x_0 + \Delta, y_0 + \Delta) - v(x_0, y_0)] + [u(x_0, y_0 + \Delta) - u(x_0 + \Delta, y_0)]\}\,i\Delta = -A\cdot\Delta - B\cdot i\Delta,$$

where $A$ and $B$ denote the expressions in the braces. But

$$u(x_0 + \Delta, y_0 + \Delta) - u(x_0, y_0) = [u(x_0 + \Delta, y_0 + \Delta) - u(x_0, y_0 + \Delta)] + [u(x_0, y_0 + \Delta) - u(x_0, y_0)]$$
$$\approx \Bigl[\frac{\partial u(x_0, y_0 + \Delta)}{\partial x} + \frac{\partial u(x_0, y_0)}{\partial y}\Bigr]\Delta$$

and

$$v(x_0 + \Delta, y_0) - v(x_0, y_0 + \Delta) = [v(x_0 + \Delta, y_0) - v(x_0 + \Delta, y_0 + \Delta)] + [v(x_0 + \Delta, y_0 + \Delta) - v(x_0, y_0 + \Delta)]$$
$$\approx \Bigl[-\frac{\partial v(x_0 + \Delta, y_0)}{\partial y} + \frac{\partial v(x_0, y_0 + \Delta)}{\partial x}\Bigr]\Delta,$$

where we, as before, use the approximate formula $f(x + \Delta) - f(x) \approx f'(x)\,\Delta$. Thus,

$$A \approx \Bigl\{\Bigl[\frac{\partial u(x_0, y_0 + \Delta)}{\partial x} - \frac{\partial v(x_0 + \Delta, y_0)}{\partial y}\Bigr] + \Bigl[\frac{\partial u(x_0, y_0)}{\partial y} + \frac{\partial v(x_0, y_0 + \Delta)}{\partial x}\Bigr]\Bigr\}\Delta \approx \Bigl\{\Bigl[\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\Bigr] + \Bigl[\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\Bigr]\Bigr\}\Delta = 0$$

by the Cauchy-Riemann relations. (In the last expressions the arguments of the functions $\partial u/\partial x$, $\partial v/\partial y$, $\partial u/\partial y$, $\partial v/\partial x$ can be equated to $x_0$, $y_0$, that is, we can consider that the partial derivatives are calculated at one vertex of the square shown in Figure 17.2.2; here the error is of the order of $\Delta$.) Similarly, we can show that $B$ is also equal to zero.

Of course, we have really only established that the integral is almost zero, that $A$ and $B$ differ from zero by a quantity of order higher than $\Delta$, and the integral by a quantity of order of smallness higher than the area $\Delta^2$ of the small square (of the order of $\Delta^3$ or $\Delta^4$ or still smaller). And even this implies that the integral is equal to zero.

Indeed, let us subdivide the initial square with side $\Delta$ into $n^2$ squares with side $\Delta/n$. The integral over the contour of the original square is, as can be easily understood, equal to the sum of $n^2$ similar integrals over the contours of all the small squares (we go along all the contours in the same direction, say, counterclockwise): in fact, when summing up all the integrals we go along each internal line segment two times in two different directions, and these summands cancel. But while the value of the total integral is of the order of $\Delta^3$, the value of the integral over each small square is of the order of $(\Delta/n)^3$ and the sum of these integrals is of the order of $n^2 (\Delta/n)^3 = \Delta^3/n$. Since this estimate is valid for any $n$, our integral must necessarily vanish.

Similarly, that is, by dividing the region inside a closed contour into small squares, we can prove that the integral taken along the contour of this region equals zero, thereby proving that integral (17.2.1) is independent of the path of integration.

Example. Consider the integral

$$I = \int_0^{1+i} z^2\,dz. \qquad (17.2.2)$$


Suppose path $A$ goes from point $z_0 = 0$ to point $z = 1 + i$ first along the real axis (from $z_0 = 0$ to $z_1 = 1$) and then along a straight line which is parallel to the imaginary axis (from $z_1 = 1$ to $z = 1 + i$). Here we have

$$I = \int_A z^2\,dz = \int_0^1 (x^2 - y^2 + 2ixy)\big|_{y=0}\,dx + \int_0^1 (x^2 - y^2 + 2ixy)\big|_{x=1}\,i\,dy = \frac{1}{3} + i\Bigl(\frac{2}{3} + i\Bigr) = -\frac{2}{3} + \frac{2}{3}i.$$

The numerical result is the same if we choose another path $B$, first along the imaginary axis from point $z_0 = 0$ to point $z_2 = i$ and then along the straight line which is parallel to the real axis (from point $z_2 = i$ to point $z = 1 + i$; verify this). Finally, we can take the integral along the diagonal $C$ of the rectangle (the square here) having vertices at points $z_0$, $z_1$, $z$, $z_2$: put $z = (1 + i)t$, with $0 \le t \le 1$; then $dz = (1 + i)\,dt$ and

$$I = \int_0^{1+i} z^2\,dz = \int_0^1 [(1 + i)t]^2 (1 + i)\,dt = (1 + i)^3\int_0^1 t^2\,dt = \frac{1}{3}(1 + i)^3 = -\frac{2}{3} + \frac{2}{3}i.$$

Integral (17.2.2) can be evaluated without resorting to all these simple arithmetic calculations. To this end, as in the case of integration of real-valued functions, it is sufficient to find the antiderivative $F(z)$ of the function $z^2$. Let $F(z)$ be a function (of course, analytic) of the complex variable such that

$$\frac{dF}{dz} = f(z). \qquad (17.2.3)$$

Then, by the Newton-Leibniz theorem,

$$\int_a^z f(z)\,dz = F(z) + C, \qquad I(a, b) = \int_a^b f(z)\,dz = F(b) - F(a), \qquad (17.2.4)$$

where $a$, $b$, $C$, $F(a)$, $F(b)$ are complex numbers. Equations (17.2.4) can be proved as in the case of functions of a real variable: we find the change in the integral $I(a, z) = \int_a^z f(z)\,dz$ at a small change in the upper limit of integration:

$$I(a, z + \Delta z) - I(a, z) = \int_z^{z + \Delta z} f(z)\,dz = f(z)\,\Delta z + \ldots \qquad (17.2.5)$$

(we have omitted here the summands of the order of $|\Delta z|^2$ and of even smaller orders of magnitude), whence in the limit of small (in absolute value) $\Delta z$ (irrespective of the direction of the increment $\Delta z$, that is, of the ratio of its real part $\Delta x$ to the coefficient $\Delta y$ of its imaginary part) we obtain

$$\frac{\Delta I(a, z)}{\Delta z} \approx f(z). \qquad (17.2.6)$$

From (17.2.6) it follows that the value of the integral $I(a, z)$ can only differ from the function $F(z)$ by a constant. The value of the constant can be found from the condition $I(a, z) = 0$ at $z = a$, whence it follows that $I(a, z) = F(z) - F(a)$, in accordance with (17.2.4). But since (see Section 15.2) for complex numbers we also have $(z^3)' = 3z^2$, we obtain $\int z^2\,dz = z^3/3 + C$, from which the value of integral (17.2.2) obtained above follows immediately.
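The independence of integral (17.2.2) of the path can also be observed numerically. The following sketch is an illustration only: the function name `path_integral` is hypothetical, and the integral is approximated by the sum $\sum f(z_i)\,\Delta z_i$ with $f$ taken at the midpoint of each small portion of a polygonal path.

```python
def path_integral(f, points, steps=2000):
    """Approximate the integral of f along the polygonal path through `points`
    by the sum of f(midpoint) * dz over small portions of each segment."""
    total = 0j
    for z0, z1 in zip(points, points[1:]):
        dz = (z1 - z0) / steps
        for i in range(steps):
            total += f(z0 + (i + 0.5) * dz) * dz
    return total

def square(z):
    return z * z

if __name__ == "__main__":
    path_A = [0, 1, 1 + 1j]      # along the real axis, then parallel to the imaginary axis
    path_B = [0, 1j, 1 + 1j]     # along the imaginary axis, then parallel to the real axis
    path_C = [0, 1 + 1j]         # along the diagonal
    for path in (path_A, path_B, path_C):
        print(path_integral(square, path))
    print((1 + 1j) ** 3 / 3)     # value given by the antiderivative z^3/3
```

All three sums should agree with $(1 + i)^3/3$ up to the small error of the approximation.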

We have presented all the reasoning 
so briefly because there is no difference 
here between (smooth) functions of the 
real variable and (analytic) functions 
of the complex variable— all the re- 
sults of Chapter 3 are applied to the 
case of functions of a complex variable. 

Now let us consider specific properties of integrals of functions of complex variables which have no analogs in the real region. Consider the integral

$$\int \frac{dz}{z} \quad (= \ln z). \qquad (17.2.7)$$

We take a circle of fixed radius $\rho$ with center at point $z = 0$ as the path of integration, that is, we set $z = \rho e^{i\varphi}$, where $\rho$ = constant and $0 \le \varphi \le 2\pi$; then $dz = i\rho e^{i\varphi}\,d\varphi$ and $dz/z = i\,d\varphi$ (the equality $d(e^{i\varphi}) = ie^{i\varphi}\,d\varphi$ follows from the fact that, by the definition of the quantity $e^{i\varphi}$, for small $\Delta\varphi$ we have $e^{i\Delta\varphi} \approx 1 + i\,\Delta\varphi$). Therefore, integral (17.2.7) taken along our closed contour is found to be equal to

$$\oint \frac{dz}{z} = \int_0^{2\pi} i\,d\varphi = 2\pi i, \qquad (17.2.8)$$

that is, its value is far from being zero. The value (17.2.8) of the integral is independent of the radius $\rho$ of the circle; moreover, it is also independent of the shape of the contour enveloping point 0: we will obtain the same result, $2\pi i$, when integrating along any other closed contour containing 0 inside, say, along the boundary of a square with vertices $\pm 1 \pm i$ (see Exercise 17.2.1).
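The independence of the result of the shape of the contour can be illustrated numerically. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the circle is replaced by a regular polygon with many sides, which is itself a closed contour surrounding the point 0.

```python
import cmath
import math

def closed_polygon_integral(f, vertices, steps=4000):
    """Approximate the integral of f round the closed polygon through `vertices`
    (traversed in the given order) by midpoint sums on each side."""
    total = 0j
    closed = list(vertices) + [vertices[0]]
    for z0, z1 in zip(closed, closed[1:]):
        dz = (z1 - z0) / steps
        for i in range(steps):
            total += f(z0 + (i + 0.5) * dz) * dz
    return total

def regular_polygon(rho, sides=720):
    """Vertices of a regular polygon of circumradius rho, listed counterclockwise."""
    return [rho * cmath.exp(2j * math.pi * j / sides) for j in range(sides)]

def inverse(z):
    return 1 / z

if __name__ == "__main__":
    square = [1 - 1j, 1 + 1j, -1 + 1j, -1 - 1j]            # counterclockwise
    print(closed_polygon_integral(inverse, square))
    print(closed_polygon_integral(inverse, regular_polygon(0.5), steps=20))
    print(2j * math.pi)
```

Reversing the order of the vertices (clockwise traversal) changes the sign of the result.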

Why is the main result concerning the equality to zero of the integral $\oint f(z)\,dz$ found to be invalid? The reason is that in the given case the integrand $f(z) = 1/z$ has a singularity inside the contour of integration: its value is infinite at point $z = 0$. No matter how small the subregions we take inside the contour, the value $f(z) = \infty$ is inevitably inside one of the subregions; it will make its contribution: murder (and the infinity inside a contour) will out!

Let us examine in detail our integral (17.2.7). This integral taken along a contour surrounding point 0 is equal to $2\pi i$. But when integrating the function $1/z$ along any other path, we also cannot ignore the fact that the integrand has a singular point.

Let 


2 



1 


(17.2.9) 


The value of this integral taken along 
the shortest path A going from point 
z = 1 to z — 2 along the real axis 
(Figure 17.2.3) is equal to In 2 ~ 0.69. 
But we can also go from point z — 1 
to point z = 2 by another path. For 
example, let us go from point z = 1 
along circle C of radius 1 with center 
at point 0 and only then go along the 
segment A of the real axis from point 1 
to point 2. Here we obtain a different 
value of integral (17.2.9): 

1 C A 

= 2m + In 2 ~ 0.69 + 6.28 i. 


8 



Figure 17.2.3 

We can go along circle C not once but 
n times and only then go along the seg- 
ment A of the real axis; here the value 
of integral (17.2.9) will be 

1 nC A 

= 1 "T" + n j "§■ = ln 2 + 2nni > 

A C 

where nC denotes the contour obtained 
in an rc-fold rotation along circle C in 
the same direction (counterclockwise). 
On the other hand we can also go along 
the same circle C in the opposite direc- 
tion, clockwise; if we go along this 
path m times we obtain the value of 
integral (17.2.9) equal to ln 2 — 2mm. 
Here the value ln 2 ^ 0.69 of integral 
(17.2.9) can be obtained over a multi- 
tude of integration paths: all paths 
which do not go around point 0, like 

path B shown in Figure 17.2.3; the 
2 

result I = j dzlz = ln 2 — 4ji i can al- 

l 

so be obtained over many (other) paths 
and so on. 

Thus, the final solution is

$$I = \int_1^2 \frac{dz}{z} = \ln 2 + i2\pi k,$$

where $k = 0, \pm 1, \pm 2, \ldots$. We may say that this is the most general determination of the natural logarithm of 2, while the number 0.69 is only one of many values of $\ln 2$, namely its principal value. Indeed, $e^I = e^{\ln 2 + i2\pi k} = 2e^{i2\pi k} = 2$.
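The dependence of integral (17.2.9) on the path can be checked numerically. The following is only a rough sketch under stated assumptions: the helper names (`integrate_path`, `circle_c`, `log2_via_path`) are illustrative and not from the text, and the integral is approximated by a simple midpoint sum along a parametrized path.

```python
import cmath
import math


def integrate_path(f, path, steps=20000):
    """Approximate the integral of f(z) dz along path(t), 0 <= t <= 1,
    by the midpoint rule (sum of f(z_mid) * (z_next - z_prev))."""
    total = 0j
    z_prev = path(0.0)
    for k in range(1, steps + 1):
        z_next = path(k / steps)
        z_mid = path((k - 0.5) / steps)
        total += f(z_mid) * (z_next - z_prev)
        z_prev = z_next
    return total


def one_over_z(z):
    return 1 / z


def segment_a(t):
    """Path A: straight from z = 1 to z = 2 along the real axis."""
    return 1 + t


def circle_c(turns):
    """Unit circle started at z = 1; turns > 0 is counterclockwise,
    turns < 0 is clockwise."""
    return lambda t: cmath.exp(2j * math.pi * turns * t)


def log2_via_path(turns):
    """Integral (17.2.9) along `turns` circuits of C followed by A."""
    loops = integrate_path(one_over_z, circle_c(turns)) if turns else 0j
    return loops + integrate_path(one_over_z, segment_a)


for k in (0, 1, -2):
    value = log2_via_path(k)
    # Every value differs from ln 2 by a multiple of 2*pi*i,
    # and exponentiating any of them gives back 2.
    print(k, value, cmath.exp(value))
```

Each circuit of $C$ shifts the result by $2\pi i$, while the exponential of every value is the same number 2.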
Thus, we have kept our word given in Section 14.3 and explained in a new way the reason for the logarithm $w = \ln z$ being many-valued. The many-valued nature is connected, on the one hand, with the fact that the function $e^{i\varphi}$ is periodic, that is, that the function equals 1 for $\varphi = 2\pi k$. On the other hand, the logarithm can be defined as the integral $\int dz/z$; in such an approach, as well, the fact that the logarithm is many-valued is connected with the behavior of the function $1/z$ at $z = 0$ and with the fact that in integrating we can go around the point $z = 0$ at which the function $f(z) = 1/z$ is infinite (such a point is termed a pole of the function of the complex variable).

Note that it is the function $1/z$ that, upon integration along a contour which surrounds point $z = 0$, yields a nonzero result, $2\pi i$ (while similar integration of function $c/z$, where $c$ is a fixed, generally speaking, complex number, gives the result $c \cdot 2\pi i$). Clearly, the integral along a closed curve of function $f(z) = a$, where $a$ is a constant number, or of function $f(z) = az^n$, where $n > 0$ (for example, of function $2z^2$ or $-iz^4$), is zero; in fact, these functions have no singular points (poles). But quite unexpectedly integration of functions that are negative (and greater than unity in absolute value) powers of $z$, say $1/z^2$ or $1/z^3$, along a closed contour surrounding the pole $z = 0$, at which the function is infinite, also yields zero: for such functions $f(z)$ we again have $\oint f(z)\,dz = 0$. Indeed, substitute into integral $\oint dz/z^2$ the values $z = \rho e^{i\varphi}$, $dz = i\rho e^{i\varphi}\,d\varphi$, where $\rho$ = constant and $0 \le \varphi \le 2\pi$; we get

$$\oint \frac{dz}{z^2} = \int_0^{2\pi} \frac{i\rho e^{i\varphi}}{\rho^2 e^{2i\varphi}}\,d\varphi = \frac{i}{\rho}\int_0^{2\pi} e^{-i\varphi}\,d\varphi = 0.$$


In the same simple way we can prove the equality $\oint dz/z^n = 0$, where the integer $n$ is greater than unity.
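The contrast between $1/z$ and the higher negative powers is easy to observe numerically. The sketch below is illustrative only: `circle_integral` is a hypothetical helper that sums $f(z)\,dz$ over a circle centered at the origin using the substitution $z = \rho e^{i\varphi}$, $dz = iz\,d\varphi$ from the text.

```python
import cmath
import math


def circle_integral(f, radius=1.0, steps=2000):
    """Integral of f(z) dz over the circle z = radius * e^{i phi},
    0 <= phi <= 2 pi, using dz = i * z * dphi at the midpoints."""
    dphi = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        z = radius * cmath.exp(1j * (k + 0.5) * dphi)
        total += f(z) * 1j * z * dphi
    return total


for n in (1, 2, 3):
    # n = 1 gives 2*pi*i; n = 2 and n = 3 give zero up to rounding.
    print(n, circle_integral(lambda z, n=n: z ** (-n), radius=0.5))
```

The radius does not matter: for $n = 1$ the factor $\rho$ cancels, and for $n > 1$ the integral of $e^{-i(n-1)\varphi}$ over a full period vanishes.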

To sum up, the value of the integral along a closed contour depends not just on whether the function is infinite inside the contour, but on the presence or absence in the expression for the function $f(z)$ (in its expansion in series extended over both positive and negative powers of $z - a$, where $a$ is the singular point of the function at which $f(z) = \infty$) of the term which is proportional to $1/(z - a)$ and on the coefficient of this term.[17.3] In view of the fact that the coefficient of $(z - a)^{-1}$ in the expansion of $f(z)$ in powers of $(z - a)$ is so important, it bears a special name, the residue of $f(z)$ at point $z = a$. Clearly, at any point where the function is analytic, that is, where it assumes a finite value and can be expanded in a Taylor's series (in the neighborhood of such a point), the residue is zero. If we take the function $f(z) = 1/z^2$, which becomes infinite at $z = 0$, it has a zero residue at the same point. In contrast, the function

$$\varphi(z) = \frac{1}{z^2 + 1} = \frac{1}{(z + i)(z - i)} = \frac{1}{2i}\,\frac{1}{z - i} - \frac{1}{2i}\,\frac{1}{z + i} \tag{17.2.10}$$


has two singular points: $z = i$ and $z = -i$. The residue of $\varphi(z)$ at $z = i$ is equal to $1/2i = -i/2$, and the residue at point $z = -i$ is equal to $-1/2i = i/2$. Indeed, if, say, we take the expansion of $(-1/2i)(z + i)^{-1}$ in powers of $(z - i)$, it has no terms with negative powers of $(z - i)$, so that the total coefficient of $(z - i)^{-1}$ here is equal to $1/2i$.
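These residues can be cross-checked by integrating over a small circle around each singular point and dividing by $2\pi i$. The sketch below assumes nothing beyond that relation; the function name `residue` and the chosen radius are illustrative.

```python
import cmath
import math


def residue(f, a, radius=0.5, steps=2000):
    """Estimate the residue of f at z = a as (1 / (2 pi i)) times the
    integral of f over a small circle centered at a."""
    dphi = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        offset = radius * cmath.exp(1j * (k + 0.5) * dphi)
        total += f(a + offset) * 1j * offset * dphi
    return total / (2j * math.pi)


def phi(z):
    """The function (17.2.10)."""
    return 1 / (z * z + 1)


print(residue(phi, 1j))                     # close to -i/2
print(residue(phi, -1j))                    # close to +i/2
print(residue(lambda z: 1 / z ** 2, 0j))    # close to 0
```

The circle of radius 0.5 around $z = i$ does not enclose the other pole $z = -i$, so only one term of (17.2.10) contributes.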

There exists a remarkable applica- 
tion of integrals of functions of a com- 
plex variable to the evaluation of inte- 
grals of a real variable. Here is a typi- 
cal and yet important example: 


$$I = \int_{-\infty}^{\infty} \frac{\cos \omega x}{1 + x^2}\,dx. \tag{17.2.11}$$

[17.3] We do not consider more complicated cases, where the integrand itself is many-valued (like, say, the function $f(z) = \ln z$).

Figure 17.2.4


This is the integral of a product of a rapidly varying function $\cos \omega x$ (where it will be assumed that $\omega$ is much higher than unity) by a slowly varying function $y_1 = (1 + x^2)^{-1}$. The curve representing the integrand is depicted in Figure 17.2.4a, with the dashed curves being the graphs of the functions $y_1 = (1 + x^2)^{-1}$ and $y_2 = -(1 + x^2)^{-1}$. According to Euler's formulas (14.3.4),

$$\cos \omega x = \frac{1}{2} e^{i\omega x} + \frac{1}{2} e^{-i\omega x}. \tag{17.2.12}$$

Substituting this into the right-hand side of (17.2.11), we split $I$ into two integrals, $I_+$ and $I_-$, where

$$I_+ = \frac{1}{2}\int_{-\infty}^{\infty} \frac{e^{i\omega z}}{1 + z^2}\,dz. \tag{17.2.13}$$

The second integral $I_-$ can be written in a similar manner, only $e^{i\omega z}$ must be replaced with $e^{-i\omega z}$. The (dummy) variable of integration in (17.2.11) is denoted here not by $x$ but by $z$, which suggests that it assumes not only real values but also complex values (this idea is central to our discussion).

Suppose that $z$ runs along the semicircle $\gamma$: $|z| = R$ in the upper half-plane ($z = x + iy$, $y > 0$), with $R$ being very large. Since

$$|f(z)| = \left|\frac{1}{2}\,\frac{e^{i\omega z}}{1 + z^2}\right| = \frac{1}{2}\,\frac{|e^{i\omega z}|}{|1 + z^2|}$$

and along $\gamma$ we have

$$|e^{i\omega z}| = |e^{i\omega(x + iy)}| = |e^{-\omega y}|\,|e^{i\omega x}| = |e^{-\omega y}| \times 1 < 1$$

(since in the upper half-plane $y$ is greater than zero and thus $e^{-\omega y} < 1$), while

$$\left|\frac{1}{1 + z^2}\right| \approx \frac{1}{|z^2|} = \frac{1}{R^2},$$

we conclude that along $\gamma$ the absolute value of the function $f(z) = \frac{1}{2} e^{i\omega z}/(1 + z^2)$ is extremely small; therefore, the integral[17.4] $\int_\gamma f(z)\,dz$ along $\gamma$ is small, too. This implies that for $R$ large, the integral

$$\oint_\Gamma \frac{1}{2}\,\frac{e^{i\omega z}}{1 + z^2}\,dz, \tag{17.2.14}$$

where $\Gamma$ is a closed contour consisting of the line segment $-R \le x \le R$ of the $x$ axis, with $R$ very large, and the semicircle $\gamma$ in the upper half-plane (Figure 17.2.5), is very close to $I_+$ along the real axis. But the integral of an analytic function over a closed contour depends, as we know, only on the singular points of $f(z)$ lying inside this contour. But does the function $f(z) = \frac{1}{2} e^{i\omega z}/(1 + z^2) = \frac{1}{2} e^{i\omega z}\varphi(z)$ have any singular points inside our contour?

[17.4] We see that it can be assumed that $|f(z)| < R^{-2}$ along $\gamma$; this together with the inequality $\left|\int f(z)\,dz\right| \le \int |f(z)|\,|dz|$ (see Exercise 17.2.2), which follows from the very definition of an integral, and the fact that the length of the semicircle $\gamma$ increases like the first power of $R$, imply that the integral $\int_\gamma f(z)\,dz$ decreases, as $R$ grows, in any case no slower than $R^{-1}$.

Figure 17.2.5

It is clear that inside $\Gamma$ the function $e^{i\omega z}$ never becomes infinite and has no
singular points. The function $\varphi(z) = 1/(1 + z^2)$, as noted earlier, has inside $\Gamma$ the singular point $z = i$, and the residue of $\varphi(z)$ at this point is $-i/2$. This implies that the function $f(z) = \frac{1}{2} e^{i\omega z}\varphi(z)$ at its singular point $z = i$ has a residue $-(i/4)e^{i\omega i} = -(i/4)e^{-\omega}$. By the general properties of analytic functions, the integral (17.2.14) over the huge contour $\Gamma$ is equal to $\frac{1}{2}\oint_\sigma e^{i\omega z}(1 + z^2)^{-1}\,dz$, where $\sigma$ is a small circle with its center at point $z = i$; this integral is equal to

$$-\frac{i}{4}\,e^{-\omega} \cdot 2\pi i = \frac{\pi}{2}\,e^{-\omega}. \tag{17.2.15}$$

The same value has the integral (17.2.14), which, hence, is independent of $R$. For this reason integral (17.2.13) is also equal to $\frac{1}{2}\pi e^{-\omega}$.

In a similar manner we find that $I_- = \frac{1}{2}\pi e^{-\omega}$, which is the second term in the integral $I$. Thus, the final result is

$$I = \pi e^{-\omega}. \tag{17.2.16}$$

It is extremely difficult to arrive at this formula without using functions of a complex variable. Numerical integration is also very involved since for $\omega \gg 1$ the expression on the right-hand side of (17.2.16) or (17.2.15) is very small. Separate positive and negative half-waves of the periodic function $\cos \omega x$ will not be especially small: their amplitudes, if we are speaking of waves corresponding to moderate values of $x$, are of the order of unity, which implies that the contribution of a half-wave (of the type we are considering here) to the integral $I$ is of the order of $\pm 1/\omega$. (Why?) On the other hand, for $\omega \gg 1$, the total value of $I$ given by (17.2.15) is much smaller than $1/\omega$. Hence, positive and negative half-waves cancel each other with a high degree of accuracy, which is related to the analyticity, or smoothness, of the “amplitude function” $y_1 = (1 + x^2)^{-1}$.

Integrals similar to (17.2.11) play an important role in the theory of vibrations and in theoretical physics in general (compare with the integral in (17.1.11a)). A slowly varying pulse (whose shape is defined by the function of time $f(t) = (1 + t^2)^{-1}$) does not excite a high-frequency oscillating system because at high frequencies ($\omega \gg 1$) the integral $\int_{-\infty}^{\infty} f(t)\cos \omega t\,dt$ is small. Only the theory of functions of a complex variable makes it possible to establish that this integral falls off exponentially as $\omega \to \infty$.


Exercises

17.2.1. Prove by direct calculation that $\oint_K dz/z = 2\pi i$, where the integration contour $K$ coincides with the boundary of the square whose vertices lie at the points $\pm 1 \pm i$.

17.2.2. Prove the inequality

$$\left|\int_A f(z)\,dz\right| \le \int_A |f(z)|\,|dz|,$$

where $A$ is an arbitrary arc in the complex plane.

17.2.3. Prove that

$$\int_{-\infty}^{\infty} e^{-t^2/\tau^2}\cos \omega t\,dt = \tau\sqrt{\pi}\,e^{-\omega^2\tau^2/4}.$$

(This result is physically important: the so-called Gaussian profile $e^{-t^2/\tau^2}$ of a slow pulse (see Figure 17.2.4b) has little influence on (does not excite) high-frequency oscillating systems, corresponding to high frequencies $\omega$.)
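Returning to the contour argument of this section, the value (17.2.15) and its independence of $R$ can be checked numerically. What follows is a minimal sketch, assuming the contour $\Gamma$ is split into the segment of the real axis and the upper semicircle, each approximated by a midpoint sum; the names `f_plus`, `segment_part`, and `semicircle_part` are illustrative, and $\omega = 2$ is an arbitrary choice.

```python
import cmath
import math


def f_plus(z, omega):
    """Integrand of (17.2.13)/(17.2.14): (1/2) e^{i omega z} / (1 + z^2)."""
    return 0.5 * cmath.exp(1j * omega * z) / (1 + z * z)


def segment_part(omega, R, steps=50000):
    """Midpoint-rule integral of f_plus along the real axis from -R to R."""
    h = 2 * R / steps
    return sum(f_plus(-R + (k + 0.5) * h, omega) for k in range(steps)) * h


def semicircle_part(omega, R, steps=50000):
    """Midpoint-rule integral of f_plus along the upper semicircle |z| = R,
    traversed counterclockwise (dz = i z dtheta)."""
    dtheta = math.pi / steps
    total = 0j
    for k in range(steps):
        z = R * cmath.exp(1j * (k + 0.5) * dtheta)
        total += f_plus(z, omega) * 1j * z * dtheta
    return total


omega = 2.0
for R in (3.0, 30.0):
    arc = semicircle_part(omega, R)
    closed = segment_part(omega, R) + arc
    # The closed-contour value stays the same; the arc contribution shrinks.
    print(R, closed, abs(arc))
print("pi/2 * exp(-omega) =", 0.5 * math.pi * math.exp(-omega))
```

The closed-contour value agrees with $\frac{\pi}{2}e^{-\omega}$ for both radii, while the share contributed by the semicircle becomes small as $R$ grows, which is why the real-axis integral $I_+$ takes the same value.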


17.3 Analytic Functions of a Complex 
Variable and Liquid Flow 

In this section we conclude the large 
topic of complex numbers and func- 
tions of a complex variable. One can 
only marvel (we already mentioned 
this before) at how the introduction of 
an “imaginary unit” $i = \sqrt{-1}$ not only resolved all the difficulties that appear in solving the various quadratic equations but also introduced harmony
into the entire theory of algebraic 
equations (or simply algebra) as well 
as the theory of functions and mathemat- 
ical analysis. All functions can natu- 
rally be generalized to functions of a 
complex variable: not only polynomials 
and algebraic functions (only here do 
the questions concerning the number of 
zeros of polynomials and the power func- 
tions with fractional exponents gain 
completeness), but also the exponential 
function, the logarithms, and trigono- 
metric functions. We replace the general 
definition of a function as correspondence 
between two sets of variables (even 
in the real domain we are not able to 
preserve this definition of a function in 
all generality) by the definition of an 
analytic function. By restricting our- 
selves in this way, we gain colossal in- 
formation about the functions: the en- 
tire apparatus of higher mathematics 
(i.e. the concepts of a derivative, an 
integral, a differential equation, a pow- 
er series) can be applied to analytic 
functions of a complex variable; how- 
ever, many statements that are true for 
analytic functions prove to be invalid 
for functions in general. Of great impor- 
tance to analytic functions are the 
points (the values of the variables) where 
the functions become infinite. It has 
been established that the behavior of 
analytic functions at such points de- 
termines a broad spectrum of properties 
of these functions; for instance, the 
presence of such points enables evalu- 
ating integrals of functions over various 
contours without actually calculating 
them. 

It has been found that by specifying 
a function of a complex variable on a 
small segment (as small as desired) 
we can, at least in principle, determine 
the function in the entire complex 
plane, over the entire range of the in- 
dependent variable. It turns out that 
analyticity is determined not by the 
fact that there exists a simple formula 
for expressing the function in terms of 
the independent variable z: an analytic 
function may be expressed, say, by an 


integral that cannot be evaluated in terms of elementary functions (of the type $\int_a^z e^{-z^2/2}\,dz$) or by the solution of a differential equation that cannot be
solved in the traditional sense (i.e. we 
cannot express the dependence of the 
unknown function on the independent 
variable by means of a formula) or by an 
infinite series. What is important is only 
that the value $w = f(z)$ of the function
be dependent on z = x + iy and not on 
x or y separately; in other words, the 
function must not break up the natural 
bond between x and y in their combi- 
nation x + iy. 

Functions of a complex variable 
prove to be a powerful tool for solving 
mathematical, physical, and engineer- 
ing problems. In such problems, natu- 
rally, the data is given by real quanti- 
ties. The answer must also be expressed 
in real values, since “the square root of 
minus one piece of chalk” cannot be 
allowed into everyday life. Nevertheless, at some intermediate stage in a problem, somewhere between the formulation of the problem and the answer, it often proves to be useful to introduce complex numbers or quantities, taking (but only temporarily) a complex number of pieces of chalk, so to say (compare with Section 17.1).

For instance, in Section 17.2 we saw that to evaluate $\int_{-\infty}^{\infty} \frac{\cos \omega x}{1 + x^2}\,dx$ (which is a purely real integral) it is advisable to write $\cos \omega x$ as $\frac{1}{2}e^{i\omega x} + \frac{1}{2}e^{-i\omega x}$
and thus introduce what seem to be 
quite unnecessary imaginary units. 
Even if we do not know the precise for- 
mulas that express a law of nature, of- 
ten only the assumption that there exists 
an analytic function of a complex vari- 
able that expresses the law (even a 
function that cannot be expressed by a 
formula) is fruitful and informative. The 
very fact that such a function exists 
leads to certain relationships generated 
by the analyticity of the function (a 
function unknown to us). Such an approach is widely used at the forefront of
modern theoretical physics, say, in 
what is known as dispersion relations 
(unfortunately, nothing more can be 
said about these relations here). The 
greatest miracle is that the imaginary 
unit and complex numbers have proved 
to be necessary for quantum theory, the 
physical theory of the microworld. The 
place of complex numbers in quantum 
theory is so important that today in 
teaching analytic geometry to students 
of physics and chemistry it is often 
thought necessary almost at the start 
to speak of a complex plane and a space 
where points are specified by complex- 
valued coordinates, although of course 
there is no way in which such a space 
can be visualized.[17.5]

Apparently, the first serious applica- 
tions of the theory of analytic functions 
of a complex variable (the very crea- 
tion of this theory was due probably to 
these applications; see p. 483) were in 
fluid mechanics , which is the theory of 
the movement of gases and liquids. 
Suppose we have a plane-parallel flow 
of an (incompressible) liquid of con- 
stant density. The velocity vector v 
of any particle of the liquid is at all 
moments of time directed parallel to 
the (horizontal) xy- plane and the entire 
column of the liquid corresponding to 
given x and y coordinates and different 
altitudes z moves as a whole (in “one 
piece”). In this case we can ignore the third dimension and consider the motion as occurring in the $xy$-plane. Suppose that the velocity $\mathbf{v}$ (a vector) of the flow at point $M = M(x, y)$ has components $v_x$ and $v_y$ (Figure 17.3.1); it is clear that $v_x = v_x(x, y)$ and $v_y = v_y(x, y)$ depend on point $M$, that is, are functions of this point, or functions of the coordinates $x$ and $y$.[17.6]


[17.5] E.g. see I.M. Yaglom, Complex Numbers in Geometry, Academic Press, New York, 1968.

[17.6] For the sake of simplicity we have restricted our discussion to the case of steady-state flow, when the velocity does not change in time and the components $v_x(x, y)$ and $v_y(x, y)$ are independent of time.



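In many cases it is admissible to assume that there exist functions $\varphi(x, y)$ and $\psi(x, y)$ such that[17.7]

$$v_x = -\frac{\partial \varphi(x, y)}{\partial x}, \quad v_y = -\frac{\partial \varphi(x, y)}{\partial y}, \tag{17.3.1}$$

$$v_x = -\frac{\partial \psi(x, y)}{\partial y}, \quad v_y = \frac{\partial \psi(x, y)}{\partial x}. \tag{17.3.1a}$$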

The function $\varphi$ is known as the velocity potential, and the motion of liquids that obeys the above assumption is called potential flow. It can be said that the flow of a liquid in the vicinity of point $M = M(x, y)$ is potential if and only if it reduces to the transfer of a mass of liquid in the direction of $\mathbf{v}$ and to the deformation of a small volume of the liquid (which is admissible in view of the fluidity of the liquid) but does not include rotation of the liquid about point $M$ (such rotation is often observed as whirlpools in rivers or even in a bathtub). In accord with this, potential flow is sometimes called irrotational (or vortex-free) and the points (generally, isolated) where this condition breaks

[17.7] Since the functions $\varphi$ and $\psi$ are defined only through the values of their derivatives in (17.3.1) and (17.3.1a), they are similar to potentials, that is, are defined only to within arbitrary summands $C_1$ and $C_2$, which enables choosing the zero values of $\varphi$ and $\psi$ at our will. Only the difference of values of $\varphi$ (or $\psi$) at two different points but not the values themselves has any physical meaning. (Note also that since $v_x$ and $v_y$ have the dimensions of velocity, m/s, and the coordinates $x$ and $y$ have the dimensions of length, m, the functions $\varphi$ and $\psi$ defined by (17.3.1) and (17.3.1a) have the dimensions of m$^2$/s.)







down are known as vortexes. The difference

$$\frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}, \tag{17.3.2}$$

which is zero if the flow is potential (why?) and, hence, is nonzero only at vortexes, is known as the strength of the particular vortex, or simply vorticity.

The function $\psi(x, y)$ is known as the stream function. It can be proved that if at a point $M(x, y)$ there exists a stream function $\psi$, the amount of liquid flowing into a small contour surrounding point $M$ will be equal to the amount of liquid flowing out of the contour every second (the liquid is assumed incompressible). Thus, the existence of a stream function at a given point $M = M(x, y)$ guarantees that point $M$ is neither a source nor a sink of liquid. The sum

$$\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} \tag{17.3.2a}$$


(which is identically zero if at point $M = M(x, y)$ there exists a stream function $\psi$) characterizes the strength
of the source or sink, while the sign of 
this sum characterizes the nature of 
the “singular point” M : if (17.3.2a) is 
positive, we have a source at point M 
(i.e. more liquid flows out of the con- 
tour surrounding M than in), while if 
(17.3.2a) is negative, we have a sink. 
(The sign of the difference (17.3.2) cha- 
racterizes the direction in which the li- 
quid rotates at the vortex.) 

The reader can easily see (Exercise 17.3.1) that the curves $\psi$ = constant are simply the streamlines, that is, the curves along which the particles of the liquid flow: along each streamline (the tangent to each such line at each point $M$ has the direction of the velocity vector $\mathbf{v}$) the value of $\psi$ remains the same. Equipotential curves, which are specified by the condition $\varphi(x, y)$ = constant, on the other hand, are perpendicular at each point to the velocity vector (see Exercise 17.3.2): the liquid moves across these curves in the direction in which the potential decreases. (In Figure 17.3.1 the equipotential curves are depicted by dashed curves.) Thus, the mesh of curves $\varphi$ = constant and $\psi$ = constant provides a complete picture of flow of the liquid as a whole.

Now let us assume that in the neighborhood of a given point $M(x, y)$ in a liquid there are neither vortexes nor sources and sinks; in other words, both functions, $\varphi$ and $\psi$, exist. Obviously, from (17.3.1) and (17.3.1a) it follows that

$$v_x = -\frac{\partial \varphi}{\partial x} = -\frac{\partial \psi}{\partial y}, \quad v_y = -\frac{\partial \varphi}{\partial y} = \frac{\partial \psi}{\partial x}. \tag{17.3.3}$$


Equations (17.3.3) are close in form to the Cauchy-Riemann equations (15.2.6), which connect the real and imaginary parts of an analytic function $w = w(z) = u(x, y) + iv(x, y)$ of a complex variable $z = x + iy$. Thus, if we assume the $xy$-plane to be the complex $z$ plane, then we naturally arrive at the analytic function

$$w = w(z) = \varphi + i\psi \tag{17.3.4}$$

of complex variable $z$, a function that is closely linked to liquid flow.

The function (17.3.4) is called the complex potential of liquid flow. It completely characterizes the flow of a liquid, so that the study of various plane-parallel flows is reduced to the study of analytic functions $w(z)$. The derivative of the complex potential,

$$v = \frac{dw}{dz} = \frac{\partial \varphi}{\partial x} + i\frac{\partial \psi}{\partial x} = -v_x + iv_y \tag{17.3.5}$$

(see (15.2.3)), is also an analytic function of complex variable $z$. This function is called the complex velocity of flow. This name suggests a relationship between the complex number $v$ and the velocity $\mathbf{v}$; obviously,

$$|v|^2 = (-v_x)^2 + (v_y)^2 = v_x^2 + v_y^2 = |\mathbf{v}|^2,$$

that is, the absolute value of $v$ is equal to the length of $\mathbf{v}$; on the other hand, Arg $v$ characterizes the direction of vector $\mathbf{v}$ (see Exercise 17.3.3). Flows specified by the complex potentials $w = \varphi + i\psi$ and $iw = -\psi + i\varphi$ can be called conjugate, because the streamlines of one flow coincide with the equipotential curves of the other, and vice versa.
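The relations (17.3.1), (17.3.1a), and (17.3.5) can be tried out on a concrete potential. The sketch below is only an illustration under stated assumptions: it takes $w = z^2$ (case (b) of Exercise 17.3.4 with $a = 1$) as the example, estimates derivatives by central differences, and uses helper names that are not from the text.

```python
def complex_potential(z):
    """Example potential w = z^2 (an assumed test case)."""
    return z * z


def velocity_from_potential(x, y, h=1e-5):
    """(17.3.1): v_x = -d(phi)/dx, v_y = -d(phi)/dy with phi = Re w."""
    phi = lambda u, v: complex_potential(complex(u, v)).real
    v_x = -(phi(x + h, y) - phi(x - h, y)) / (2 * h)
    v_y = -(phi(x, y + h) - phi(x, y - h)) / (2 * h)
    return v_x, v_y


def velocity_from_stream(x, y, h=1e-5):
    """(17.3.1a): v_x = -d(psi)/dy, v_y = d(psi)/dx with psi = Im w."""
    psi = lambda u, v: complex_potential(complex(u, v)).imag
    v_x = -(psi(x, y + h) - psi(x, y - h)) / (2 * h)
    v_y = (psi(x + h, y) - psi(x - h, y)) / (2 * h)
    return v_x, v_y


def complex_velocity(z, h=1e-5):
    """dw/dz estimated by a central difference along the real direction."""
    return (complex_potential(z + h) - complex_potential(z - h)) / (2 * h)


x, y = 0.7, -0.3
v_x, v_y = velocity_from_potential(x, y)
print(v_x, v_y)                       # from the potential phi
print(velocity_from_stream(x, y))     # same components from psi
print(complex_velocity(complex(x, y)), complex(-v_x, v_y))
```

For this potential $\varphi = x^2 - y^2$ and $\psi = 2xy$, so both routes give $v_x = -2x$, $v_y = 2y$, and $dw/dz = 2z$ coincides with $-v_x + iv_y$.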

Quite close to the above scheme of liquid flow is the use of analytic functions in the classical theory of the electromagnetic field.[17.8] Let us consider a flat electric field. We will assume that within the plane region we are considering there are no free electric charges or variable magnetic fields (the presence of such fields is equivalent to the presence of charges, in accord with the law of induction). Suppose that $\Phi = \Phi(x, y)$ is the potential of the electric field at a point $M = M(x, y)$. Then the electric field strength at this point is fixed by vector $\mathbf{E} = (E_x, E_y)$ with coordinates

$$E_x = -\frac{\partial \Phi}{\partial x}, \quad E_y = -\frac{\partial \Phi}{\partial y}. \tag{17.3.6}$$

In addition to potential $\Phi$ we can introduce the stream function of the field, $\Psi = \Psi(x, y)$, such that[17.9]

$$E_x = -\frac{\partial \Psi}{\partial y}, \quad E_y = \frac{\partial \Psi}{\partial x}. \tag{17.3.6a}$$

The curves $\Psi$ = constant are the lines of force of our electric field, while the curves $\Phi$ = constant are the equipotential curves. By virtue of (17.3.6) and (17.3.6a), the function

$$W = \Phi + i\Psi \tag{17.3.7}$$


is an analytic function of the complex 
variable z = x + iy; it is called the 


[17.8] Precisely, here we can speak only of electrostatics and magnetostatics, since two coordinates ($x$ and $y$) prove to be insufficient for the general problems involving electromagnetic phenomena: three spatial variables and one time variable are required.

[17.9] Since $\Phi$ and $\Psi$ have the dimensions of voltage, V, and the coordinates $x$ and $y$ have the dimensions of length, m, it is clear that the dimensions of $E_x$ and $E_y$ are those of V/m (we can also say that V/m are the dimensions of $\mathbf{E}$).


complex potential of the electric field. The derivative

$$E = \frac{dW}{dz} \left(= \frac{\partial \Phi}{\partial x} + i\frac{\partial \Psi}{\partial x} = -E_x + iE_y\right) \tag{17.3.8}$$

of the complex potential can be called the complex electric field strength, since this complex quantity is very close in meaning to the electric field vector $\mathbf{E}$; in particular, $|E| = |\mathbf{E}|$. The electric fields with $W = \Phi + i\Psi$ and $iW = -\Psi + i\Phi$ are sometimes called conjugate: the streamlines of one coincide with the equipotential curves of the other, and vice versa.

We will not dwell any further on the 
various ideas of the mathematical theo- 
ries of liquid flow and electric field, 
which have to do with functions of two 
variables x and y and, respectively, 
with differential equations involving the 
partial derivatives of these functions, 
which are simply known as partial 
differential equations (see Section 10.8). 
We note only that the theory of analy- 
tic functions of a complex variable 
plays an extremely important role in 
these purely applied problems. 


Exercises 

17.3.1. Prove that (a) the curve $\psi$ = constant is a streamline of liquid flow; in other words, that the tangent to this curve at each point $M(x, y)$ of the curve coincides in direction with the vector $\mathbf{v}(x, y)$, and (b) if $AB$ is an arbitrary arc (in the plane of liquid flow) with endpoints $A = A(x_1, y_1)$ and $B = B(x_2, y_2)$, then liquid flow across the arc per unit time is equal to the difference $\psi(x_2, y_2) - \psi(x_1, y_1)$ (the increment of the stream function along arc $AB$).

17.3.2. Prove that the equipotential curves $\varphi$ = constant and the streamlines $\psi$ = constant are perpendicular at each point in the liquid flow (precisely, the tangents at point $M(x, y)$ to the streamline and equipotential curve passing through this point are perpendicular to each other).

17.3.3. Suppose $V = V(x, y)$ is a complex number such that $\overrightarrow{OV} = \mathbf{v}$ (where $\overrightarrow{OV}$ is a vector with the initial point at $O(0, 0)$ and the terminal point at $V$, and $\mathbf{v}$ is the velocity vector at point $M(x, y)$). Prove that points $v$ and $V$ in the complex plane, where $v$ is the complex velocity of flow, are symmetric with respect to the imaginary axis (and, hence, Arg $v = \pi -$ Arg $V$).

17.3.4. Describe (plane-parallel) flow of an (incompressible) liquid for the following complex potentials $w = \varphi + i\psi$: (a) $w = az$, where $a$ is a real or purely imaginary number, (b) $w = az^2$ ($a$ is a real or purely imaginary number), (c) $w = a/z$ ($a$ is a real or purely imaginary number), and (d) $w = \ln z$ and $w = i \ln z$.

17.4 Application of the Delta Function 

If Sections 17.1 to 17.3 logically belonged to the material of Chapters 14
and 15, since in these sections we dealt 
with complex numbers and analytic 
functions of a complex variable, the 
present section is a continuation of 
Chapter 16, which must be looked 
through before going on. We note also 
that the first example, which starts this 
section, is closely related to the content 
of Section 9.12, while the second exam- 
ple, which deals with Newton’s second 
law (17.4.1), continues the discussion 
in Section 9.5 about the impulse and 
the motion of a particle moving under a 
short impulse, say, a blow. Therefore, 
before studying these examples it is
advisable to go over the appropriate 
sections of Chapter 9. 

First of all, we will show you how 
the delta function permits abridging 
and making the writing of the condi- 
tions in many problems more convenient. 

Let us consider a rod with a variable cross section to which a number of separate point loads of masses $m_1$, $m_2$, etc. are attached (Figure 17.4.1). Let the mass per unit length of the rod be expressed by the function $\sigma(x)$. The mass of the rod without the loads is $\int_a^b \sigma(x)\,dx$, while with the loads it is

$$M = \int_a^b \sigma(x)\,dx + \sum_i m_i,$$

where the summation is extended over all the loads attached to the rod. The position of the center of gravity is

$$X = \frac{1}{M}\left(\int_a^b x\sigma(x)\,dx + \sum_i x_i m_i\right).$$

The moment of inertia with respect to the origin is

$$I = \int_a^b x^2\sigma(x)\,dx + \sum_i x_i^2 m_i.$$

But with the aid of the delta function it is possible to include the separate masses in a generalized density function, which we denote by $\eta(x)$. It is defined by the formula

$$\eta(x) = \sigma(x) + \sum_i m_i\,\delta(x - x_i).$$

Indeed, if we consider the general distribution of mass along the rod, we can say that at the points where the loads are attached the density exhibits infinite jumps. With the aid of the new function $\eta(x)$ all the quantities can be written in uniform fashion and more succinctly:

$$M = \int_a^b \eta(x)\,dx, \quad X = \frac{1}{M}\int_a^b x\eta(x)\,dx, \quad I = \int_a^b x^2\eta(x)\,dx.$$

The concept of the delta function permits combining continuously distributed masses and point masses in a single general expression.
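The equivalence of the two ways of writing $M$, $X$, and $I$ can be illustrated numerically. The following is a sketch under stated assumptions: the rod, its density $\sigma(x) = 1 + x$ on $[0, 2]$, and the two loads are invented for the example, and the delta function is imitated by a narrow Gaussian of unit area (one of the many possible approximating functions).

```python
import math


def delta_approx(x, width=0.01):
    """Narrow Gaussian of unit area standing in for the delta function."""
    return math.exp(-(x / width) ** 2) / (width * math.sqrt(math.pi))


def integrate(g, a, b, steps=40000):
    """Midpoint-rule integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h


# Hypothetical rod on [0, 2] with density sigma(x) = 1 + x and two loads.
a, b = 0.0, 2.0
sigma = lambda x: 1.0 + x
loads = [(0.5, 0.5), (1.5, 1.5)]          # (position x_i, mass m_i)


def eta(x):
    """Generalized density: sigma(x) plus one narrow peak per load."""
    return sigma(x) + sum(m * delta_approx(x - xi) for xi, m in loads)


# Quantities written uniformly through eta(x).
M = integrate(eta, a, b)
X = integrate(lambda x: x * eta(x), a, b) / M
I = integrate(lambda x: x * x * eta(x), a, b)

# The same quantities from the integral-plus-sum formulas.
M_sum = integrate(sigma, a, b) + sum(m for _, m in loads)
X_sum = (integrate(lambda x: x * sigma(x), a, b)
         + sum(xi * m for xi, m in loads)) / M_sum
I_sum = (integrate(lambda x: x * x * sigma(x), a, b)
         + sum(xi * xi * m for xi, m in loads))

print(M, M_sum)
print(X, X_sum)
print(I, I_sum)
```

The two sets of numbers agree up to a small error that shrinks together with the width of the approximating peak.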

Figure 17.4.1

Another example of the use of the delta function refers to the motion of a mass point (or material particle). The basic equation, it will be recalled, expresses Newton's second law:

$$m\frac{d^2x}{dt^2} = F(t) \tag{17.4.1}$$

(see Section 9.4). Recall the arguments in Section 9.5 to the effect that the action of the impulse is independent of the law of variation of the force, provided that the force is sufficiently brief. These considerations are similar to what was said in Section 16.4 to the effect that the delta function can be constructed out of a variety of functions $\varphi(x)$ and concerning the conditions when it is possible to replace a finite function $\varphi(x)$ with the generalized, singular function $\delta(x)$.

If the concrete form of the function of the force is not essential in the problem about a blow, this means that $F(t)$ may be replaced by the delta function, $F(t) = J\delta(t - \tau)$, where $\tau$ is the instant of the blow, and $J = \int F(t)\,dt$ is the impulse. We will carry out the integration of the equation of motion under a unit delta force formally and according to all the rules. Let the particle, prior to the blow, be at rest at the origin: $x = 0$ and $v = dx/dt = 0$ at $t = -\infty$. We will also assume that the (unimportant) factor $J$ in the expression for the generalized force is equal to unity. Then Eq. (17.4.1) takes the form

$$m\frac{d^2x}{dt^2} = \delta(t - \tau). \tag{17.4.2}$$

Integrating (17.4.2) and bearing in mind that $d^2x/dt^2 = dv/dt$, with $v(t)$ the velocity of the particle, we get

$$v(t) = \frac{1}{m}\int_{-\infty}^{t} \delta(t - \tau)\,dt = \frac{1}{m}\theta(t - \tau), \tag{17.4.3}$$

where the function $\theta(x)$ is expressed by formula (16.3.1) (see the graph of this function in Figure 16.3.1). Thus, the velocity is expressed by a step-like function of time $t$ (Figure 17.4.2a): $v = 0$ for $t < \tau$ and $v = 1/m$ for $t > \tau$.

Figure 17.4.2

The next step consists in determining the path. From the fact that $v = dx/dt$ we get, by integrating both sides of Eq. (17.4.3),

$$x(t) = \begin{cases} 0 & \text{for } t < \tau, \\ \dfrac{1}{m}(t - \tau) & \text{for } t > \tau. \end{cases} \tag{17.4.4}$$

(Why?) The graph of the path is shown in Figure 17.4.2b.
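That a brief force of unit impulse acts like a delta function whatever its shape can be seen in a small simulation. The sketch below is illustrative and rests on assumptions not in the text: two invented pulse shapes (rectangular and triangular, each of unit area), a particle of mass $m = 2$, and a simple step-by-step integration of $m\,d^2x/dt^2 = F(t)$ from rest.

```python
def rectangular_pulse(tau, width):
    """Force of unit impulse: height 1/width on |t - tau| < width/2."""
    return lambda t: 1.0 / width if abs(t - tau) < width / 2 else 0.0


def triangular_pulse(tau, width):
    """Force of unit impulse: triangle of base `width` centered at tau."""
    half = width / 2
    return lambda t: max(0.0, 1.0 - abs(t - tau) / half) / half


def simulate(force, m, t_start, t_end, steps=30000):
    """Integrate m x'' = F(t) from rest at the origin; returns the
    position and velocity (x, v) reached at t_end."""
    dt = (t_end - t_start) / steps
    x = v = 0.0
    for k in range(steps):
        acc = force(t_start + (k + 0.5) * dt) / m   # acceleration at mid-step
        x += v * dt + 0.5 * acc * dt * dt
        v += acc * dt
    return x, v


m, tau, t_end = 2.0, 1.0, 3.0
for make_pulse in (rectangular_pulse, triangular_pulse):
    x, v = simulate(make_pulse(tau, 0.02), m, 0.0, t_end)
    # Compare with (17.4.3) and (17.4.4): v = 1/m, x = (t - tau)/m.
    print(make_pulse.__name__, x, v, (t_end - tau) / m, 1.0 / m)
```

Both pulses leave the particle with velocity $1/m$ and position $(t - \tau)/m$, in agreement with (17.4.3) and (17.4.4).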

Characteristic of the curve $x(t)$ of the path is the salient point at $t = \tau$. Here again we are convinced that the second derivative of a function having a salient point contains the delta function: the function $x(t)$ has a salient point. According to the equation of motion, the force is proportional to $d^2x/dt^2$, and $x(t)$ with a salient point was obtained precisely for a force that was proportional to $\delta(t - \tau)$, so that in the case of a salient point, $d^2x/dt^2$ contains a delta function, which is what we set out to prove.

Now let us take the next step. The problem of the motion of a body under a given force is linear. This means that if there are two solutions, $x_1(t)$ and $x_2(t)$, corresponding to two distinct forces, $F_1(t)$ and $F_2(t)$, the sum of the solutions, $x(t) = x_1(t) + x_2(t)$, is a solution that corresponds to the action of the sum of the forces, $F(t) = F_1(t) + F_2(t)$. This property (the superposition principle) is a consequence of the simple fact that the second derivative of a sum of functions is the sum of the second derivatives of the functions:

$$\frac{d^2x}{dt^2} = \frac{d^2(x_1 + x_2)}{dt^2} = \frac{d^2x_1}{dt^2} + \frac{d^2x_2}{dt^2}.$$

Taking into account that $d^2x_1/dt^2 = F_1(t)/m$ and $d^2x_2/dt^2 = F_2(t)/m$, we get

$$\frac{d^2x}{dt^2} = \frac{F_1(t)}{m} + \frac{F_2(t)}{m} = \frac{F(t)}{m},$$

which is what we set out to prove: that the sum of the solutions, $x_1 + x_2$, describes the motion produced by the sum of the forces.

Only one reservation is in order: so- 
lution of the equation of motion depends 
not only on the law of force but also on 
the initial conditions, that is, on the 
initial position and velocity of the giv- 
en mass. If we choose these conditions 
thus: x_1 = 0 and dx_1/dt = 0 at t = −∞,
and x_2 = 0 and dx_2/dt = 0 at
t = −∞, then the sum of the solutions,
x, will also satisfy the same condition:
x = 0 and dx/dt = 0 at t = −∞.

Let us now combine the reasoning 
concerning linearity and the familiar 
solution for the delta function so as to 
obtain a general solution for a force 
that depends on time in an arbitrary 
fashion. We partition the graph of the 
force F(t) into strips of width Δτ (Figure
17.4.3). What does a separate strip
located between the instants τ and τ + Δτ
of time t represent? Let us change the
designations, leaving t for “current” time
varying from −∞ to +∞, whereas τ
will refer to the given chosen strip. The
height of the strip is F(τ), the width
is Δτ, and the area (i.e., the impulse)
is F(τ) Δτ. Since the strip is located
at t = τ, it is obvious that it can be
replaced by the delta function with a
coefficient equal to the impulse:



F(τ) Δτ δ(t − τ). We already know the
solution of the equation of motion for
the delta function. We denote it by
x_1(t, τ). The solution as a function
of time t depends on the instant τ of
application of the force. Recall that

\[
x_1(t, \tau) = \begin{cases} 0 & \text{for } t < \tau, \\ \dfrac{1}{m}(t - \tau) & \text{for } t > \tau. \end{cases} \tag{17.4.4a}
\]

One of the strips into which the force 
has been decomposed, from τ to τ + Δτ,
is δ(t − τ) with the coefficient
F(τ) Δτ. Thanks to the linearity of the
equation, the solution for a force in
the form of such a strip is obtainable
by multiplying x_1 by that coefficient:
F(τ) Δτ x_1(t, τ). This is the solution
for the action of a single strip. 

Now let us take advantage of line- 
arity and write out the solution for the 
function F (t), which we consider as 
the sum of the strips. It is clear here 
that the summation should actually be 
replaced by integration. The result is 

\[
x(t) = \int_{-\infty}^{\infty} x_1(t, \tau)\, F(\tau)\, d\tau. \tag{17.4.5}
\]

At first glance this formula is rather 
strange: the x-coordinate at time t is
expressed by an integral from −∞ to
+∞ with respect to τ, that is, the force
enters into this expression at all instants
of time. Yet it is clear that the
law of force subsequent to time t does
not affect the preceding motion. However,
there is no error in the expression
for x(t). The properties of the function
x_1(t, τ) ensure reasonable properties
of the solution. Indeed, x_1(t, τ) is
zero for t < τ. Hence, when integrating
with respect to τ we actually do not
need to take τ > t, since the integrand
is identically zero due to the factor
x_1(t, τ) being equal to zero. Recalling
the expression (17.4.4a) for x_1(t, τ), we
obtain

\[
x(t) = \frac{1}{m} \int_{-\infty}^{t} (t - \tau)\, F(\tau)\, d\tau. \tag{17.4.5a}
\]


Figure 17.4.3 
This method of obtaining a solution
has very great general significance. To 
summarize then: if for a linear system 
we know a solution referring to the 
action of the delta function, the solu- 
tion referring to the action of an arbit- 
rary function (F(t) in our example) is 
obtained by simple summation (inte- 
gration, to be more precise). 
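
The summation over strips can be tried out numerically. The sketch below is only an illustration: the function names, the midpoint rule, and the choice of a constant force switched on at t = 0 are assumptions made for this example, and the lower limit −∞ is replaced by an instant before which the force is zero.

```python
def green_free_particle(t, tau, m):
    """Response x_1(t, tau) of a mass m, at rest before tau, to a unit impulse at tau."""
    return (t - tau) / m if t > tau else 0.0


def position(force, t, m, t_start, steps=2000):
    """Approximate x(t) = integral of x_1(t, tau) F(tau) d tau by the midpoint rule.

    t_start is an assumed instant before which the force vanishes; it stands in
    for the lower limit of minus infinity.
    """
    if t <= t_start:
        return 0.0
    d_tau = (t - t_start) / steps
    total = 0.0
    for i in range(steps):
        tau = t_start + (i + 0.5) * d_tau   # middle of the i-th strip
        total += green_free_particle(t, tau, m) * force(tau) * d_tau
    return total


def constant_force(tau, f0=3.0):
    """Illustrative force: zero before tau = 0, equal to f0 afterwards."""
    return f0 if tau >= 0.0 else 0.0


m = 2.0
t = 4.0
numeric = position(constant_force, t, m, t_start=0.0)
exact = 3.0 * t**2 / (2.0 * m)   # uniformly accelerated motion from rest
print(numeric, exact)
```

For this force the sum over strips reproduces the familiar law of uniformly accelerated motion, x = F t²/(2m), which is a convenient check on formula (17.4.5a).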

The ideas of linearity and addition 
(the real term is superposition) of solu- 
tions apply not only to such simple prob- 
lems as the motion of a point; they 
hold true in vast areas of mathematics, 
physics, and the natural sciences. It some- 
times happens that a system is very 
complicated and it is impossible to 
solve the equations even for the most 
simple action of the delta function. A 
solution corresponding to the delta 
function can occasionally be obtained 
experimentally. In other cases such a 
solution can be obtained from physical 
reasoning (see Exercise 17.4.1). Then 
linearity comes into play and we obtain 
the answer for an arbitrary acting func- 
tion. The solution that corresponds to 
the delta function (x_1(t, τ) in our example
above) is so important that it has
a special name, the Green function of
the problem. Curiously enough, the
English mathematician George Green
(1793-1841), for whom the function was
named, lived in the 19th century and
quite naturally knew nothing about the
delta function. [17.10] But it was only


17.10 The author of excellent papers in
mathematics and mathematical physics, Green 
came from a family so poor that he did not 
receive any education at all; it was not until 
comparatively late in life that he became 
acquainted, through his own efforts, with 
mathematics and physics. At the end of his 
life, though, chiefly thanks to support from 
the influential William Thomson (Baron 
Kelvin), Green was offered the post of pro- 
fessor of mathematics at Cambridge Univer- 
sity. 



the introduction of the delta function 
that clearly and succinctly explained 
the essence of the Green function. 

Examples of this nature abound in 
mathematics, for we know numerous re- 
sults pertaining to tangent lines, areas, 
and volumes that were obtained before 
the invention of derivatives and inte- 
grals. The advance of science lies not 
only in the attainment of new heights 
and new results but also in popularizing 
and simplifying earlier derivations. The 
aim of this book is precisely that: to 
simplify the understanding of our clas- 
sical heritage, the fundamentals of high- 
er mathematics. 


Exercises 

17.4.1. Consider a string held taut by 
a force k and with ends fixed at points x = 0
and x = l. Regarding the deviation as small,
determine by the parallelogram-of-forces law
the shape of the string under a unit load at
point x = x_1 (Figure 17.4.4a). Obtain the
formula for deviation of the string under a
force distributed along the length of the string
via an arbitrary law f(x) (Figure 17.4.4b).

17.4.2. Find the motion of a pendulum un- 
der a force expressed by the delta function, 
that is, solve the equation m (d²x/dt²) =
−kx + δ(t − τ) provided that x = 0 and
dx/dt = 0 at t = −∞. Using this solution,
find the motion of the pendulum under a 
force dependent on time via an arbitrary law. 
Higher mathematics, or, to be more
exact, differential and integral calcu- 
lus, makes it possible to solve a large 
class of problems that do not lend them- 
selves to solution by the methods of 
elementary mathematics (arithmetic, 
algebra, and geometry). Of tremendous 
importance is the very formulation of 
such new concepts as instantaneous 
velocity, acceleration, impulse. These 
notions (and numerous others in diverse 
fields) are formulated exactly only in 
the language of derivatives and inte- 
grals. 

The knowledge you have gained in 
reading this book constitutes but a 
small part of the whole science of mathe- 
matics, to be exact, a small part of 
those divisions of mathematics that find 
application in the natural sciences and 
technology. In the Preface we said 
that a cultivated person should have a 
general picture of mathematics, irre- 
spective of his or her field of interest. 
The three parts of our book provide 
enough general information for that 
purpose. But if your field is connected 
with engineering, chemistry, or (espe- 
cially) physics, you will need more. Here 
we wish to outline in brief the fields 
of physics and the associated divisions 
of mathematics that you will most likely
study in the future.

Note that up till now the exposition 
has been that of a textbook, and if you 
have put your mind to the matter at 
hand, you will have mastered the mate- 
rial in all its details. What now follows 
is a very short outline of the difficult 
problems that lie ahead. The style is no 
longer that of a textbook. We do not 
hope to explain the content of complex 
mathematical theories on so few pages, 
of course. We wish merely to give the 
reader a general impression of the prob- 
lems of these theories to arouse inte- 
rest in them. Some of the books listed 
in the Selected Readings will help you 
to obtain a broader view of mathema- 
tics and of mathematical physics. 

For a better understanding of what is 


to follow, let us briefly state a common 
feature of all the problems we have dealt 
with up to now. These were problems 
involving the motion of a single par- 
ticle in mechanics or the flow of current 
in a circuit. We dealt with functions of 
one variable (time). The number of 
functions was one (current as a function 
of time) or two (the position of a body, 
x = x(t), and the velocity of the body,
v = v(t)).

Quite naturally there follows the 
purely quantitative generalization: prob- 
lems involving the motion of two bod- 
ies, three bodies, etc., will ultimately 
lead to problems of the motion of a gas 
or a liquid. But one gram of hydrogen 
consists of 3 × 10²³ molecules, hence
3 × 10²³ separate bodies. It must be
clear at this point that the old method 
is now useless and new methods are 
needed. Not only is it impossible to 
solve 3 × 10²³ equations, there is neither
paper nor time enough to write
them down. Even writing them down is 
out of the question because we can never 
know either the exact number of mole- 
cules or their initial positions, which 
would then yield the initial conditions 
imposed on the equations. 

To our aid come new sciences that
rest on differential and
integral calculus. These are hydrodynamics
and gasdynamics, which did not
come into being until the 19th centu- 
ry. Both differ substantially from the 
mechanics of a single point in the for- 
mulation of problems and in the meth- 
ods used to solve them (compare Sec- 
tions 9.14 and 9.15 and Section 17.3). 
Here we are not interested in the actual 
number of molecules of a gas contained 
in a volume. Instead, such notions are 
introduced as the distribution of the 
density of the gas in space, ρ = ρ(x, y, z)
(the mass of the gas per unit volume), [C.1]

C.1 Strictly speaking, the introduction
of the notion of density at a point in space,
ρ(x, y, z), requires an operation similar
to differentiation. This quantity is defined 
as the limit of the mean density of the gas 
the pressure of the gas, p = p(x, y, z),
and the velocity of the gas at different
points in space. What is more, all these
quantities depend on time, t, so it is
natural to write ρ = ρ(x, y, z, t).
Thus, from problems of several func- 
tions of one variable we pass to func- 
tions of several independent variables. 

Accordingly, in setting up the equa- 
tions that describe the motion of a gas 
and other characteristics, we encounter 
partial derivatives with respect to time 
and the spatial coordinates, for instance,
∂ρ/∂t, ∂ρ/∂x, ∂ρ/∂y, and ∂ρ/∂z,
where, say,

\[
\frac{\partial \rho}{\partial y} = \lim_{\Delta y \to 0} \frac{\rho(x, y + \Delta y, z, t) - \rho(x, y, z, t)}{\Delta y}.
\]
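
A partial derivative with respect to one coordinate can be estimated by keeping the other variables fixed and taking a small but finite step in that coordinate alone. The following is a minimal sketch; the density field rho_example is a hypothetical function chosen only because its derivative is easy to check by hand, and the step size is an arbitrary assumption.

```python
def partial_y(rho, x, y, z, t, dy=1e-6):
    """Forward-difference estimate of the partial derivative of rho with respect to y.

    Only y is shifted; x, z and t are held fixed.
    """
    return (rho(x, y + dy, z, t) - rho(x, y, z, t)) / dy


def rho_example(x, y, z, t):
    # Hypothetical field, not taken from the text: its exact y-derivative is 2*x*y.
    return x * y**2 + z * t


estimate = partial_y(rho_example, 1.0, 3.0, 2.0, 5.0)
print(estimate)   # close to 2*x*y = 6 at the chosen point
```

The estimate differs from the exact value by an amount of the order of the step, which shrinks as the step is made smaller, in keeping with the limit in the definition.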

An extremely important division of 
mathematical physics is the investigation
of partial differential equations;
we gave a very rough outline of them
in Section 10.8. These equations describe
the motion of liquids, gases, and solids,
the propagation of heat in various
media, the diffusion of atoms and molecules.
They are so important that they
are often called equations of mathematical
physics.

Another point that must be mentioned 
is that velocity is a vector quantity, 
v = v(x, y, z) or even v = v(x, y, z, t),
which means that at every point in a
gas (or liquid) the magnitude and direction
of the velocity are specified. In
other words, we can say that three components
of the vector v, namely, v_x, v_y,
and v_z, are given and that these components
depend on the point in space and
on time. This leads to more complica- 
tions (or simplifications) in problems of 
hydrodynamics and gasdynamics; how- 
ever, we will not go into them here. 

In all these theories it is possible, at 
least in principle, to continue regard- 
ing the system as consisting of separate 
particles and as being described by 
many functions of one variable (time). 


contained in a small volume (the mass-to- 
volume ratio) as the volume becomes ever 
smaller. Changes in the density due to one 
or two molecules leaving or entering the 
volume do not interest us here. 


But there are other physical theories, 
primarily the theory of electromagnetism,
where such an approach is impossible.

Consider two point charges at rest. 
The force acting between them depends 
on their position (the distance r between 
them). This would seem to be a problem
involving six functions (the coordinates
x_1, y_1, and z_1 of one charge and
the coordinates x_2, y_2, and z_2 of the
other) of one variable (time). To a first
approximation, the motion of the charges 
changes but little — one merely has 
to take into account the magnetic in- 
teraction that appears between the 
charges, an interaction that depends on 
the velocities of the particles. 

An extremely important factor— one 
demanding a fundamentally new ap- 
proach — is the existence of a lag in the 
interaction due to the propagation of 
the interaction with a finite speed (the 
speed of light). The action of one charge 
on the other at time t depends on the 
position (and velocity) of the first charge 
at some earlier time t − τ, which
lags behind t by a finite amount τ =
r/c, where c is the speed of light. What
takes place in this interval of time? 

In a vacuum the space between the 
charges contains an electric field and a 
magnetic field. At each point in space we 
specify two vectors, E and H, called 
the electric field strength and the mag- 
netic field strength , respectively. The 
magnitude and direction of each vector 
depend on the presence and motion of 
charges. Small (“test”) charges placed at 
certain points in space provide the pos- 
sibility of measuring the fields E and 
H at the given points. But there is 
still more to the theory of electromag- 
netism. In the vacuum, where there are 
no charges, the electric and magnetic 
fields act on each other: a change (over 
time) in one field is related, via Max- 
well’s equations, to the spatial deriv- 
atives of the other field. The time de- 
rivative of the magnetic field generates 
a circular electric field, while the time 
derivative of the electric field plays the 
same role as electric current and generates
a magnetic field. The result is a
neat pattern: charges generate fields 
at the points where they exist, and the 
interaction of the fields carries the in- 
formation concerning the position and 
motion of the charges throughout the 
entire space. 

Decisive in the development of this 
theory is a consideration of the two 
vectors E and H, or the six quantities 
E_x, E_y, E_z, H_x, H_y, and H_z, as functions
of four quantities: the three coordinates
x, y, and z and the time t. But
specifying these functions at a certain 
time amounts to the fixing of an infini- 
tude of their values at all points of 
space. The theory is more difficult than 
the theory of the motion of one or sev- 
eral particles, but the results are 
richer. 

Mathematically, the theory of the 
electromagnetic field is a theory of par- 
tial differential equations and in this 
respect is similar to the theory of elas- 
ticity, acoustics, and gasdynamics. The 
only difference is that in the latter case 
the equations are arrived at via an 
idealization: in speaking of the density 
of a gas we abstract ourselves from the 
separate molecules and only in this 
approximate sense can a gas be considered
as a continuous medium characterized
by a continuous function ρ(x, y, z, t)
(the density of the gas). An electric 
(or, more precisely, an electromagnet- 
ic) field is actually a continuous func- 
tion of the spatial coordinates and 
time. 

Hydrodynamics, the foundations of 
which were laid in the 18th century by 
the works of D. Bernoulli, J. d’Alem- 
bert, and L. Euler, prepared the mathe- 
matical tools for the electromagnetic 
theory worked out in the second half of 
the 19th century by James Clerk Maxwell
(1831-1879). No wonder, then,
that at first attempts were made to 
transfer the ideas of mechanics to the 
electromagnetic theory. A special substance,
called the ether, was hypothesized
as being responsible for electric
and magnetic phenomena. We know 
that the mathematical analogy between 


hydrodynamics and electromagnetism, 
and also the similarity of the equations 
of the two theories, remains; however, 
the physical meaning of the electro- 
magnetic theory has proved to be differ- 
ent and does not reduce to mechanics. 
After prolonged and intense debates 
scientists completely abandoned the 
idea of any ether. 

When speaking of a mathematical 
theory, one must not only speak about 
the statement of the problem and the 
initial equations but also about the na- 
ture of the results. 

We can name two types of solutions 
for partial differential equations. One is 
characteristic of phenomena that devel- 
op in a limited volume: these are natu- 
ral oscillations with definite frequencies. 
A body of a given shape has a certain 
set of frequencies. 

Recall the pendulum and its definite 
frequency of oscillation. If an external 
(periodic) force F acts on the pendulum, 
we get the characteristic phenomenon 
of resonance when the frequency of F 
is close to the (natural) frequency of the 
pendulum. All these physical ideas fit 
the theory of ordinary differential equations:
m (d²x/dt²) = −kx + F(t). In
the theory of partial differential equa- 
tions it appears that a body has many 
frequencies and behaves like a set, or 
collection, of many pendulums with 
distinct frequencies, whence there are 
many resonances. [C.2] You can verify
this at once if you have a piano at home. 
Depress one of the keys slowly and 
soundlessly, so as to release the string 
without striking it with the hammer. 
Now strike the other key sharply and 
listen to the response of the free string. 
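
The resonance of a single pendulum mentioned above can be illustrated with a few numbers. The sketch assumes a force of the form F(t) = f0 cos(ωt) and ignores friction; substituting x = A cos(ωt) into m (d²x/dt²) = −kx + F(t) gives A = f0/(k − mω²). The numerical values of m, k, and f0 are arbitrary choices made for the example.

```python
import math


def steady_amplitude(f0, m, k, omega):
    """Magnitude of A in the forced oscillation x = A cos(omega t)
    for m x'' = -k x + f0 cos(omega t), with friction neglected."""
    return abs(f0 / (k - m * omega**2))


m, k, f0 = 1.0, 4.0, 1.0        # illustrative values, not from the text
omega0 = math.sqrt(k / m)       # natural frequency of the pendulum

# The amplitude grows sharply as the driving frequency approaches omega0.
for omega in (0.5, 1.5, 1.9, 1.99, 2.5):
    print(omega, steady_amplitude(f0, m, k, omega))
```

A body with many natural frequencies would show a peak of this kind near each of them.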

The other type of solution of partial 
differential equations has to do with 
matter or fields that fill all space (pro- 
pagation of waves). These are the waves 
of radio and light (in the electromagnet- 
ic theory) and sound waves in elastic 
media. Waves have the remarkable 
property of being able to carry information:

C.2 If you wish to understand this aspect
in detail, go back to Section 10.8.

pressure or an electric field at
one point (near the receiver) as a func- 
tion of time proves to be similar to 
the curve of that same source quantity 
(near the transmitter) as a function of 
time. 

It is possible to construct the solu- 
tions of equations describing the direct- 
ed beam of a searchlight or laser. The 
beam of a searchlight and the jet of 
water from a hose are strikingly simi- 
lar. Knowledge of the properties of so- 
lutions of different kinds of problems 
and analogies between phenomena de- 
scribed by similar types of equations 
have always been extremely important 
in the development of physics. 

The method of mathematical model- 
ing consists in finding a mathematical 
scheme so close to the phenomenon un- 
der consideration that an examination 
of the scheme can replace a study of 
the phenomenon. The scheme might be 
either an ordinary differential equation 
or a system of such equations. Some- 
times it is a partial differential equation 
with initial and boundary conditions of 
one kind or another (see Section 10.8). 
In other (and more sophisticated) ma- 
thematical constructions we have to 
deal with the notion of probability, on
which we will dwell below. It is essen- 
tial that the model should present a 
sufficiently accurate description of the 
given real-life phenomenon. [C.3] If, in so
doing, we come to an earlier-studied 
mathematical scheme, we can immedi- 
ately transfer to the new case all the 
findings pertaining to it. For example, 
from the fact that the mechanical vi- 
brations (see Chapter 10) and electro- 
magnetic oscillations (see Chapter 13) 
are described by differential equations 
of the same type it follows that the 
theory of electric circuits has a number 


C.3 The words “sufficiently accurate” em-
phasize the idealization (that is, making 
a rough approximation of the phenomenon 
under study by disregarding details that are 
secondary and do not interest us at the mo- 
ment) that we invariably encounter when 
passing from physical reality to the mathe- 
matical scheme describing it. 


of attributes (resonance, for instance) 
that are characteristic of mechanical 
vibrations. 

The power of mathematics is connect- 
ed largely with the repetition of one 
and the same mathematical scheme 
applied to a large number of phenomena 
of different kinds. Mathematics might 
be described as a rather limited set of 
“keys,” each of which unexpectedly 
opens up numerous doors that appear 
to be different (compare, for instance, 
Sections 2.1, 2.2, and 2.5). This cir- 
cumstance is connected with the fact 
that nature’s arguments are simple — 
there are no special intricacies — and 
there are not so many simple mathemat- 
ical constructions. The fantastic gene- 
rality of mathematical methods has 
always amazed scientists. [C.4]

For a long time atomic spectra were 
a mystery to physicists. It was not so 
much the specific laws and numerical 
values of the frequencies but the very 
fact that one and the same atom emits 
or absorbs (via resonance) the oscilla- 
tions of several distinct but quite defi- 
nite frequencies. The similarity to the 
oscillations of elastic bodies suggested 
the formulation of the equations of 
quantum mechanics. Likewise, the similarity
between a stream of particles
and solutions of equations particular 
to waves found its application in quan- 
tum mechanics, where it gave rise to 
particle-wave dualism. For instance, 
light can be regarded as a wave (the 
viewpoint of Christian Huygens) and 
at the same time as a stream of parti- 
cles (or corpuscles) of light, or photons 
(Newton’s concept; see Section 12.6 and 
subsequent sections). In pre-20th-cen- 
tury physics these two approaches 
seemed to contradict each other, and a 

C.4 See, for instance, the following articles
by several distinguished physicists:
E.P. Wigner, “The unreasonable effectiveness
of mathematics in the natural sciences” in
Symmetries and Reflections, Indiana University
Press, Bloomington, 1967 (reprinted by
Ox Bow Press, Woodbridge, Conn., 1979),
pp. 222-237, and C.N. Yang, “Einstein's
impact on theoretical physics” in Physics
Today, 33, No. 6, pp. 42-49, 1980.
choice had to be made. Modern quantum
mechanics shows how and in what
sense both points of view are correct 
and supplement each other. 

We have already presented several 
examples pertaining to the theory of 
equations of different types. 

An example of a different kind is 
offered by complex numbers and func- 
tions of a complex variable (see Chap- 
ters 14, 15, and 17). Complex numbers, 
which originally arose from the solu- 
tion of algebraic equations, were for a 
long time regarded with much mis- 
trust. However, progress in hydrodyna- 
mics and gasmechanics turned out to be 
closely connected with these unusual 
“numbers.” The theory of functions of a 
complex variable was evolved by Augustin
Cauchy and Georg Riemann and
further developed by Karl Weierstrass
(1815-1897) just when it was needed to 
make further scientific and technical 
advances possible. The rise of aeronau- 
tics at the beginning of the 20th cen- 
tury was based in large measure on cal- 
culations of fluid flow by the methods 
of “complex analysis.” One of the pio- 
neers in this field was Nikolai Zhukov- 
sky (1847-1921), the distinguished Rus- 
sian investigator in the field of me- 
chanics. 

Geometry offers another marvelous 
example of the development of mathe- 
matics. 

Ordinary experience teaches us that 
in space it is convenient to introduce 
three coordinates: x, y, and z. Any fur- 
ther complications would seem to be 
superfluous, “a trick of the devil.” Yet, 
coordinates ξ, η, and ζ can be introduced
in a different way so that the
condition ξ = constant corresponds to
some curved surface (whereas x =
constant for arbitrary y and z is the
equation of a plane perpendicular to the
x axis). For instance, it is possible to
introduce into space the three coordinates
ρ, φ, and θ, where ρ = OM is the
distance between the variable point M
and the origin O, while φ and θ are the
geographical coordinates (longitude and
latitude) on the sphere ρ = constant.
(In this case the surfaces φ = constant
are half-planes and the surfaces θ =
constant are cones (why?).) Coordinates
can also be introduced by countless 
other methods. 
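
The coordinates of distance, longitude, and latitude can be explored with a short computation. This is a sketch under stated assumptions: the latitude is measured from the plane z = 0, the z axis passes through the poles, angles are in radians, and the function name is introduced here only for illustration.

```python
import math


def to_cartesian(rho, phi, theta):
    """Convert distance rho, longitude phi and latitude theta to x, y, z."""
    x = rho * math.cos(theta) * math.cos(phi)
    y = rho * math.cos(theta) * math.sin(phi)
    z = rho * math.sin(theta)
    return x, y, z


# With the latitude held fixed, the ratio of height z to the distance from
# the z axis is the same for every rho and phi, so the points lie on a cone.
theta = 0.4
for rho, phi in ((1.0, 0.0), (2.0, 1.0), (5.0, 2.5)):
    x, y, z = to_cartesian(rho, phi, theta)
    print(z / math.hypot(x, y), math.tan(theta))
```

With the longitude held fixed instead, x and y stay in a constant ratio while the distance from the z axis remains non-negative, which describes a half-plane bounded by the z axis.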

To summarize, then, we can introduce 
curvilinear coordinates ξ, η, and ζ,
and with a lot of effort, agonizingly, 
learn to compute point-to-point distances 
and other quantities with the aid of 
these new coordinates. 

At first glance this is a dull and to- 
tally useless effort. One must possess a 
peculiar bent of mind to see beauty in 
overcoming difficulties and in devel- 
oping a clumsy theory with arbitrary 
coordinates of the most general
nature. [C.5]

Elaboration of the theory of curvili- 
near coordinates in Euclidean space 
might appear to be purely an achievement
of method. However, the generalized
coordinates ξ, η, and ζ are equally
convenient (or equally inconvenient) in 
describing ordinary space (in which 
Euclidean geometry holds true) and 
“curved” space. The rectangular coor- 
dinates x , y, and z are convenient for 
ordinary space but are absolutely un- 
suitable for a description of curved space. 
These coordinates do not give even 
a hint of the possibility of the existence 
of any other kinds of space. 

The transition to curvilinear coordi- 
nates, which seemed to be such a need- 
less complication, actually prepared us 
for a vast range of spaces whose very 
existence was totally unknown to us. 
Then it turned out that the force of 
universal gravitation is linked up with 


C.5 Such was the mind of the great Gauss,
who worked out a “geometry in curvilinear 
coordinates.” Displaying a profound under- 
standing of the applied significance of mathe- 
matics (for example, Gauss arrived at the 
idea of curvilinear coordinates from his 
work in geodesy), he also possessed an insati- 
able curiosity concerning the secrets of the 
world of mathematics, which prompted him 
to devote hours and days to arduous cal- 
culations in the hope of some new results (say, 
the conversion of common fractions to deci- 
mal fractions with hundreds of figures in the 
period of the fraction). 
the fact that space is somewhat curved.
True, this “somewhat” had to do with 
the conditions here on the earth and in 
the solar system. In certain phenomena 
of a larger scale (catastrophic explosions 
of stars, evolutionary processes in the 
Universe), space may turn out to be 
highly curved, and here one must inevi- 
tably go over from conventional Eucli- 
dean geometry to non-Euclidean. Me- 
taphorically speaking, the creation of 
non-Euclidean geometry, a purely mathematical
achievement, [C.6] was a flash
of lightning, and it was followed by a 
clap of thunder — the creation of the 
general theory of relativity, or the geo- 
metric theory of gravitation. Today 
physics is being invaded by completely 
new kinds of geometry, which at first 
glance seem quite removed from any 
reality. 

Back in the 18th century, d’Alem- 
bert drew attention, in his article 
“Dimensionality” (in the famous Encyclopédie,
written together with Denis
Diderot (1713-1784)), to the fact that our
world is essentially four-dimensional — 
besides the three spatial coordinates x, 
y, and z of a point, it is also necessary, 
in order to distinguish events taking 
place in the world, to know the time, t. 
Moreover, the connection between the 
coordinates x, y, z, and t of the four- 
dimensional physical world later proved 
to be more complicated than New- 
ton and d’Alembert had originally imag- 
ined; in particular, in this “world” it is 
impossible to distinguish unambiguous- 
ly between the spatial coordinates x, y, 
and z and the temporal coordinate t 
(this is the essence of Einstein's special 
theory of relativity). Later (beginning, 
essentially, with the works of Lagrange, 
a junior contemporary of d’Alembert) 


C.6 Again, how amazing that the non-Euclidean
spaces that were so necessary for
progress in physics came into being at just 
the right time in the works of mathematicians 
(Nikolai Lobachevsky (1792-1856) of Kazan, 
Russia, John Bolyai (1802-1860) of Hungary, 
and Carl Gauss; then Georg Riemann and, 
later, Felix Klein (1849-1925) of Germany 
and Henri Poincare (1854-1912) of France). 


the idea of multidimensional spaces 
entered into theoretical physics. For 
instance, phase space: for the simplest
case of a moving point, the phase space
is six-dimensional, with coordinates x,
y, z, x′, y′, and z′ (x′ = dx/dt, y′ =
dy/dt, z′ = dz/dt), which are the spatial
coordinates and the projections of the
velocity vector of the point. [C.7] Today,
too, in theoretical physics we often have
to deal with multidimensional worlds
that are simply impossible to imagine;
finally, we have the ultimate monstrosity:
an infinite-dimensional world each
point of which has an infinitude of coordinates
(expressed as numbers, and
we are lucky if the numbers are real
and not complex). Instances of such
infinite-dimensional spaces are found in
quantum mechanics. It was most fortunate
for physicists that the famous
German mathematician David Hilbert 
(1862-1943) evolved a theory of infinite- 
dimensional spaces just before the ap- 
pearance of quantum mechanics. 

In recent years topology, one of the
newest and most abstract sections of
geometry, which took its start in the
works of Henri Poincaré, has found unexpected
applications. Topology deals
exclusively with the “roughest” properties
of geometric figures. For instance,
in topology a sphere does not differ
from a cube, but a doughnut (torus) does
because it has a hole. Topology analyzes
these “rough” geometric properties
with great thoroughness and depth. All
of a sudden this not-so-trivial analysis
aroused the interest of physicists. [C.8]

C.7 The representation of a physical process
in the phase space of a (mechanical)
system is known as the “phase portrait” of the
process. For instance, the phase portrait
of the uniform motion of a point along the z
axis, z = at + b, is the straight line z′ = a
in the phase plane (z, z′) of the moving point
(here z′ = dz/dt is the velocity of the point).
The phase portrait of harmonic oscillations,
that is, oscillations characterized by a constant
total energy E = kx²/2 + m(x′)²/2, is
described by a point in the phase plane (x, x′)
moving along an ellipse E = constant.

C.8 We would like to draw the reader’s
attention to the excellent, though by no
means easy, textbook by B.A. Dubrovin,

And no one can say what other esoteric
approaches to space will be concocted
by physicists of the future. [C.9]

At the intersection of (multidimensional)
mathematical analysis and topology
there lies a new theory, the
theory of catastrophes. (Catastrophe
theory is the invention primarily of the
French mathematician René Thom
(b. 1923).) It immediately gained (perhaps
excessively wide) recognition in
connection with the many different
possibilities of its use in the natural
sciences, humanities, and social sciences. [C.10]
The theory poses the question of the
conditions under which smooth and generally
well-behaved analytic functions
can lead to
explosions, that is, a discontinuity in
the final result. Simple examples of
this kind were known long before the
name “catastrophe theory” appeared.

Let a quantity $a$ be given as a smooth
function of another quantity, $b$, or
$a = f(b)$, and suppose that $a$ can be
fixed and then $b$ can be observed or
measured. The smoothness of the func-
tion $a = f(b)$ by no means guaran-
tees the smoothness of the inverse func-
tion $b = \varphi(a)$. For instance, if
$a = f(b)$ has a maximum $a = a_{\max}$ at
$b = b_m$, then for $a < a_{\max}$ there are
two values of $b$ near $b_m$, whereas for
$a > a_{\max}$ there is not a single value of
$b$. Hence, $b$ experiences a discontin-
uity as $a$ varies smoothly.
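
The jump in the number of solutions can be seen in a minimal sketch. The particular function $f(b) = a_{\max} - (b - b_m)^2$ and the values of `A_MAX` and `B_M` are assumptions chosen only for illustration; they do not come from the text.

```python
import math

A_MAX = 1.0   # assumed maximum value of a (illustrative)
B_M = 2.0     # assumed position of the maximum (illustrative)

def f(b):
    """Smooth function a = f(b) with a single maximum A_MAX at b = B_M."""
    return A_MAX - (b - B_M) ** 2

def inverse_values(a):
    """Return every b with f(b) = a, that is, the branches of b = phi(a)."""
    gap = A_MAX - a
    if gap < 0:
        return []                      # a > a_max: no value of b at all
    if gap == 0:
        return [B_M]                   # the two branches merge at the maximum
    root = math.sqrt(gap)
    return [B_M - root, B_M + root]    # a < a_max: two values of b near b_m

if __name__ == "__main__":
    for a in (0.9, 0.99, 1.0, 1.01):
        print(a, inverse_values(a))
```

As $a$ passes smoothly through `A_MAX`, the list of solutions shrinks from two values to one and then to none, which is the discontinuity described above.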

A.T. Fomenko, and S.P. Novikov, Modern
Geometry: Methods and Applications, 3 parts,
Berlin, Springer, 1984, which is intended
largely for theoretical physicists.

C.9 An interesting but not simple booklet
which deals with the overall question of the
relationships between physics and mathematics
from the present-day viewpoint is Yu.I. Manin’s
Mathematics and Physics, Birkhäuser,
Boston, 1981. The two papers mentioned
in footnote C.4 deal with the same question.

C.10 See, for example, the article by
E.C. Zeeman “Catastrophe theory” in Scientific
American, 234, No. 4, 1976, pp. 65-83, and
V.I. Arnold’s book Catastrophe Theory, Spring-
er, Berlin, 1984. A thorough treatment of the
subject is presented in Catastrophe Theory and
Its Applications, Pitman, London, 1978, by
T. Poston and I. Stewart, which also con-
tains many expressive examples.


We often meet phenomena of this 
kind in the theory of combustion and 
explosion, where the conditions imposed 
on the chemical reactions are deter- 
mined via algebraic equations. The ex- 
ternal conditions enter the problem as 
parameters, but the solution of the 
equations may be a discontinuous func- 
tion of the parameters. Discontinuities 
are therefore specific features of the so- 
lutions. Catastrophe theory shows that 
these features can be classified by di- 
viding them into several “rough” (topo- 
logical) types corresponding to the char- 
acteristic phenomena described by a 
given equation. 

In the study of nature, a diversified
approach is desired: the overcoming of
mathematical difficulties, the mastering
of the mathematical apparatus, physi- 
cal intuition, boldness of conception, 
experimentation, and mathematical mo- 
deling. All these combined make it 
possible to advance science. 

Modern physics would be absolutely 
inconceivable without the concept of 
probability, which is quite different
from anything discussed in this book.

All the laws of nature mentioned
above — Newton’s second law $m\ddot{x} = F$
and its particular case, the law of iner- 
tia (see Section 9.4), Boyle’s law 
pV = constant and its generalization, 
van der Waals’s law (see Section 7.3)— 
were of a dynamical nature, that is, 
they enabled us to make a precise pre- 
diction of the phenomenon: when the 
volume occupied by a gas is halved, its 
pressure doubles (Boyle’s law); if the 
motion of a body takes place in the 
absence of any forces acting on it, its 
velocity in future will be exactly the 
same as it was at the start; and so on. 
The laws of the falling of, say, a ran- 
domly tossed coin are altogether differ- 
ent. Here all one can say in advance is 
that when a coin is tossed into the air 
many times, it will fall heads up in 
approximately half of the cases, that is, 
the probability of its falling heads up is 
equal to 0.5 (this result is guaranteed by 
the condition that the coin is true, that 
is, both sides are absolutely equal). The
physical laws by which one cannot
predict the exact outcome of an exper- 
iment but only the probability of one 
outcome or another, that is, the fre- 
quency with which the outcome is re- 
peated when an experiment is carried 
out many times under the same condi-
tions, are called statistical laws, and
the branch of mathematics that deals
with them is known as the theory of
probability. C.11

The example of a coin being tossed many 
times is the simplest, almost a trivial, instance 
of a probabilistic process characterized by the
accidental nature of the outcome of its sepa- 
rate stages (in this case, separate tossings). 

Here is a still more typical example of the 
same kind, used by physicists as a simple mo- 
del of the process of the diffusion of gases (the 
model of Paul and Tatiana Ehrenfest C.12).
Let us take a vessel divided into two parts,
A and B, by a porous membrane. It contains N
particles (gas molecules, say) distributed in
some way between A and B; at each moment
of time, one of the particles is chosen at ran-
dom and transferred from its part of the vessel
to the other. We can pose the question of how
many times, on the average, a transfer must
be made so that the initial distribution, 100 par-
ticles in A and 0 particles in B, becomes 50
particles in A and 50 particles in B. It turns 
out that about 140 transfers are needed. The 
reverse process is also possible (at least in 
principle): with an initial distribution of 50 
particles in A and 50 in B, random transfers 
can lead to 100 particles in A and 0 in B, but
this requires $2^{100} \approx 10^{30}$ transfers. This example


C.11 The most elementary concepts of the
theory of probability are explained for begin- 
ners in B.V. Gnedenko and A.Ya. Khinchin, 
An Elementary Introduction to the Theory of 
Probability , W.H. Freeman, San Francisco, 
1961, in K.L. Chung, Elementary Probability 
Theory with Stochastic Processes , 3rd ed., 
Springer, New York, 1979, and in F. Mosteller, 
R.E.K. Rourke, and G.B. Thomas, Jr., 
Probability with Statistical Applications, 2nd 
ed., Addison-Wesley, Reading, Mass., 1970.
Two other good books are: W. Feller, An 
Introduction to Probability Theory and Its 
Applications , 2 vols., Wiley, New York, 
1968, 1971, and J. Neyman, First Course in 
Probability and Statistics , Holt, Rinehart, 
and Winston, New York, 1951. 

C.12 Paul (or in full, in Russian, Pavel Si-
gizmundovich) Ehrenfest (1880-1933) and Tati-
ana Alexeyevna Ehrenfest-Afanassjewa (1876-
1964) were prominent 20th-century physicists
who worked in Russia (and, later, in the 
USSR) and the Netherlands. 


illustrates the concepts of thermodynamic
irreversibility and fluctuations. C.13
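
The two mean transfer counts quoted for this model can be checked with a short exact calculation. The sketch below is not taken from the text: it assumes the standard first-step recurrence for a random walk on the number $i$ of particles in A, $d_i = (N + (N - i)\,d_{i+1})/i$, where $d_i$ is the mean number of transfers needed to go from $i$ to $i - 1$ particles in A. The function name is hypothetical.

```python
from fractions import Fraction

def mean_transfers_down(n, start, target):
    """Mean number of transfers for the count in A to fall from start to target.

    Uses the recurrence d_i = (n + (n - i) * d_{i+1}) / i, where d_i is the
    mean time to go from i particles in A to i - 1 (so that d_n = 1).
    """
    d = Fraction(0)
    total = Fraction(0)
    for i in range(n, target, -1):
        d = (n + (n - i) * d) / i    # mean time for the step i -> i - 1
        if i <= start:
            total += d               # only steps at or below the start count
    return total

if __name__ == "__main__":
    # From 100 particles in A down to 50 (compare "about 140" in the text).
    print(float(mean_transfers_down(100, 100, 50)))
    # From 50 particles in A up to 100 is, by symmetry, the same as the
    # count in B falling from 50 to 0 (compare 2**100 in the text).
    print(float(mean_transfers_down(100, 50, 0)))
```

The enormous gap between the two results is what makes the spontaneous return to the initial state unobservable in practice, although it is not forbidden in principle.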

The case of two vessels is an idealization, 
of course. Typical problems deal with the 
movement of particles in space, where we seek 
the law that governs changes in the particle 
distribution depending on the coordinates and 
time. When it is a question of large particles, 
visible under a microscope, we speak about 
Brownian motion. It is characteristic that 
the physicists who studied Brownian motion 
(A. Einstein, M. Smoluchowski, A. Fokker, 
and M. Planck) turned to probabilistic pro- 
cesses long before the problem attracted the at- 
tention of mathematicians (A. Kolmogorov, 
A. Khinchin, W. Feller, Paul P. Levy, and 
others). For this reason the main equation 
of (continuous) processes allied to the “toy” 
process of transferring particles we have ex-
amined is called the Einstein-Smoluchowski
equation or the Fokker-Planck equation or 
the Kolmogorov equation. 

It is typical that present-day physics 
pays careful attention to the statistical 
laws of nature and, hence, makes exten- 
sive use of the mathematical tools of 
the theory of probability. For instance, 
an aspect of the transition from the clas- 
sical mechanics of Galileo and Newton 
to quantum mechanics is the replace- 
ment of the “dynamical” world of the 
rationalists in the 17th to 19th centu- 
ries by the random world of modern 
physics of the microcosm, in which one 
cannot in principle speak of the exact 
trajectory of a particle (say, an elec- 
tron) but only of the probability of 
finding the particle at one or another 
point in space at a certain time. 

Finally, another relatively young 
branch of mathematics that has acquired 
primary importance for physics is 
the theory of groups. This theory stud-
ies the “degree” of symmetry of physi-
cal or other objects. It is clear, for exam-
ple, that a square (Figure C.1a) is more
symmetrical than an isosceles trapezoid
(Figure C.1b) and that the latter is more
symmetrical than a trapezium (Fig-
ure C.1c). The exact meaning of these
statements is as follows. It is clear that 


C.13 For further details and proofs of the
results we have just stated see, for instance,
J. G. Kemeny and J. L. Snell, Finite Markov
Chains, 3rd printing, Springer, New York,
1983, Section 7.3.

Figure C.1

the symmetry group of the square, a
group consisting of all the motions
that transform the square into itself, is
broader than the symmetry group of
an isosceles trapezoid or that of a tra-
pezium. The symmetry group of a square
contains reflections in both
diagonals and both midlines and
also rotations about the
square’s center through 90°, 180°, and
270°, while the symmetry group of an
isosceles trapezoid contains, apart from
the identity, only the reflection in the midline
MN (see Figure C.1b); the symmetry
group of a trapezium (Figure C.1c) is
even more limited, consisting merely
of a single rotation through 360°, which
leaves all the points in place, so that it does
not contain a single true movement!
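
The comparison of the three figures can be sketched as a count of the motions that map each vertex set onto itself. The vertex coordinates below are hypothetical, chosen so that the centroid of each figure lies at the origin, and only eight candidate motions (rotations through multiples of 90° and reflections in the axes and in the lines $y = \pm x$) are tested. For the two less symmetric figures this illustrates the comparison and does not prove that no other symmetry exists.

```python
# Eight candidate motions as 2x2 integer matrices: rotations through
# 0, 90, 180, 270 degrees, then reflections in the x axis, the y axis,
# the line y = x and the line y = -x.
CANDIDATES = [
    ((1, 0), (0, 1)), ((0, -1), (1, 0)), ((-1, 0), (0, -1)), ((0, 1), (-1, 0)),
    ((1, 0), (0, -1)), ((-1, 0), (0, 1)), ((0, 1), (1, 0)), ((0, -1), (-1, 0)),
]

def apply(matrix, point):
    """Apply a 2x2 matrix to a point (x, y)."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

def symmetries(vertices):
    """Candidate motions that map the vertex set onto itself."""
    vertex_set = set(vertices)
    return [m for m in CANDIDATES
            if {apply(m, p) for p in vertex_set} == vertex_set]

# Hypothetical coordinates, each figure centered at the origin.
SQUARE = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
ISOSCELES_TRAPEZOID = [(-2, -1), (2, -1), (1, 1), (-1, 1)]
IRREGULAR_QUADRILATERAL = [(-2, -1), (3, -1), (1, 2), (-2, 0)]

if __name__ == "__main__":
    for name, shape in [("square", SQUARE),
                        ("isosceles trapezoid", ISOSCELES_TRAPEZOID),
                        ("irregular quadrilateral", IRREGULAR_QUADRILATERAL)]:
        print(name, len(symmetries(shape)))
```

The counts include the identity, which is why even the least symmetric figure has one element in its group.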

The theory of groups was created by 
a brilliant young French revolutionary 
named Evariste Galois (1811-1832), who 
was killed in a duel when he was only 
20. C.14 (The duel had evidently been
provoked by the police.) Galois devel- 
oped the ideas of this theory as applied to 
algebraic equations in the early 1830s. 
He suggested classifying equations in 

C.14 Read Leopold Infeld’s book Whom the
Gods Love : The Story of Evariste Galois , 
McGraw-Hill, New York, 1948. 


accordance with their inherent “sym- 
metry groups” and had no idea whatever 
of the possible physical applications of 
the concepts he had evolved. What is
more, at the beginning of the 20th cen- 
tury the prominent English astrophysi- 
cist James Hopwood Jeans (1877-1946) 
energetically protested against the in- 
clusion of elements of group theory in 
the course of mathematics for students 
of physics because he thought physicists 
would never have any use for it. Today,
when the theory of groups has acquired
such great significance in physics,
Jeans’s statement is often quoted as an 
example of a prediction that failed. 

When scientists speak of the role 
that group theory plays in physics today, 
what they have in mind mainly is its 
application to the fundamental ques- 
tions of the structure of matter, first and 
foremost to the theory of elementary 
particles. But the first practical appli- 
cations of the ideas of symmetry were in 
crystallography. After all, crystals are 
highly symmetric bodies. It was found 
that crystals could be classified on the 
basis of their symmetry, and that such 
a classification sheds substantial light 
on all their properties. A big event at 
the end of the 19th century was the 
enumeration of all possible systems of 
crystal symmetry; there proved to be
230. These are the Fedorov groups, so
named in honor of Evgraf Fedorov
(1853-1919) who, along with A. M. Schoen-
flies, a German, and W. Barlow, an En-
glishman, discovered the crystallogra-
phic groups.

Physicists also found use for the 
Shubnikov groups named after the So-
viet crystallographer Aleksei Shubnikov
(1887-1970), who made a study of the
symmetry of colored (for instance, black 
and white) ornaments. His theory also 
takes into account the colors of separate 
sections. The point is that other physi- 
cal properties of objects (for example, 
electric charge) can play the role of 
color (black and white). 

An enumeration of the Fedorov (and 
Shubnikov) groups required a thorough 
analysis of the very concept of “a sys-
tem of symmetry,” and an understand-
ing of the meaning of the operation of
composition of rotations, which substi- 
tutes one rotation of a crystal for an 
equivalent of several rotations. 

It is clear that each movement of a
crystal or any other body that trans-
forms it into itself constitutes a cer-
tain transformation in space; we can
call a composition, that is, a sequence
of two transformations $\alpha$ and $\beta$ (first $\beta$
and then $\alpha$), a product: $\gamma = \alpha\beta$. For
example, the composition of two reflec-
tions $\sigma$ with respect to the straight line
$Ox$ or of four rotations $\pi$ through 90°
constitutes an identical transformation
$\varepsilon$, which returns each point to its place:
$\sigma^2 = \varepsilon$ and $\pi^4 = \varepsilon$ (Figure C.2). Thus
we arrive at the “arithmetic” (or “al-
gebra”) of symmetries and the possibil-
ity of utilizing the precise tools of
mathematics in a “symmetry calculus.”
This calculus of transformations (or
symmetries) is the subject of the theory
of groups. A peculiar situation arises
here in that $\alpha\beta$ may not be equal to
$\beta\alpha$, or that operations in symmetry cal-
culus may be noncommutative. C.15 For
instance, the product $\sigma\pi$ of a rotation
through 90° and a reflection with re-
spect to the $x$ axis (first the rotation,
then the reflection) is equivalent to a
single reflection with respect to the
straight line $x + y = 0$ (Figure C.3a),


C.15 The first system of noncommutative
numbers, called quaternions, was discovered
by the famous Irish algebraist, astronomer,
and physicist William Rowan Hamilton (1805-
1865). His quaternions differ from the com-
mon complex numbers $x + iy = z$ in that
they contain three imaginary units, $i$, $j$,
and $k$, so that the general quaternion is
$w = u + xi + yj + zk$, where $u$, $x$, $y$ and $z$
are real numbers. The square of each imagi-
nary unit is $-1$, but they are noncommuta-
tive, so $ij = k$, but $ji = -k$. Hamilton’s
calculus of quaternions was the first version of
vector calculus (he called a quaternion without
imaginary units, $w_0 = u$, a scalar and used
the term vector for a purely imaginary quater-
nion, $xi + yj + zk$). The renowned
Treatise on Electricity and Magnetism (1873)
of James Clerk Maxwell was written in the
language of quaternions, to which Maxwell
was accustomed, and not in the language of
vectors so familiar to us.

Figure C.2

whereas the product $\pi\sigma$ (first the reflec-
tion, then the rotation) is equivalent
to an entirely different operation, a
reflection with respect to the straight
line $x - y = 0$ (Figure C.3b).
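
These two products can be checked with 2×2 matrices. The sketch below assumes that the rotation through 90° is counterclockwise (the text does not state the direction; with this choice the results agree with the lines named above). The names `SIGMA`, `PI`, `EPSILON` and `compose` are introduced only for this illustration.

```python
def compose(first, second):
    """Matrix of 'apply first, then second' (the matrix product second * first)."""
    return tuple(
        tuple(sum(second[i][k] * first[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

SIGMA = ((1, 0), (0, -1))     # reflection in the x axis
PI = ((0, -1), (1, 0))        # rotation through 90 degrees, assumed counterclockwise
EPSILON = ((1, 0), (0, 1))    # identical transformation

SIGMA_PI = compose(PI, SIGMA)    # first the rotation, then the reflection
PI_SIGMA = compose(SIGMA, PI)    # first the reflection, then the rotation

REFLECT_X_PLUS_Y = ((0, -1), (-1, 0))   # (x, y) -> (-y, -x): mirror line x + y = 0
REFLECT_X_MINUS_Y = ((0, 1), (1, 0))    # (x, y) -> (y, x): mirror line x - y = 0

if __name__ == "__main__":
    print(SIGMA_PI == REFLECT_X_PLUS_Y)
    print(PI_SIGMA == REFLECT_X_MINUS_Y)
    print(SIGMA_PI != PI_SIGMA)
```

Writing the rightmost factor of a product as the transformation applied first matches the convention stated in the text (first $\beta$ and then $\alpha$ for $\alpha\beta$).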

All these considerations played a sub- 
stantial role in the study of the sym- 
metry of crystals. Recall what was said 
in Section 14.1 about the origin of com- 
plex numbers. Counting began with the 
natural numbers 1, 2, 3, . . .. Later 
there was a need for negative numbers, 
— 1, — 2, — 3, and so on, and also zero; 
at the same time, fractions came into
being and also irrational numbers, such
as $\sqrt{2}$. A further generalization of the
number concept led to complex num- 
bers; only after the introduction of these 
unusual numbers was the theory of 
equations complete (in particular, the 
elementary theory of quadratic equa- 
tions). In Chapters 14, 15, and 17 we 
showed how fruitful the generalization 
of the number concept was, especially 
when combined with the derivative and 
the integral. 

However, mathematicians could not 
bring themselves to challenge the prin- 
ciple of commutativity in numbers— 
for all newly introduced numbers it 
was always $a + b = b + a$ and $ab = ba$.
What Galois accomplished should,
therefore, be regarded as a most funda-
mental advance: he was the first to con-
sider groups whose elements are non-
commutative; in general, in the “arith-
metic of Galois” $ab \ne ba$. In everything
else, the theory of crystallographic 
groups is close to ordinary arithmetic: it 
has a unit element such that the product 
of this element with any member of 
the group, in either order, is that same 
member (in the “calculus of symmetries” 
the role of the unit element is played by 
the “identical transformation” e), the 
operation of division is introduced as 
the inverse of multiplication
($a \div b = ab^{-1} = c$ if $a = cb$), and so on. Howev-
er, the fact that multiplication may be
noncommutative later played a most 
important role in physics. 

Although the theory of groups was 
put to active use in the theory of crys- 
tals in the 19th century, the problems 
that arose seemed very remote from 
the profound problems of the main “buil- 
ding blocks of the Universe” — elemen- 
tary particles. In 1932, however, the 
famous German physicist Werner Hei- 
senberg established that replacing a 
(positively charged) proton in a nucleus 
by an (electrically neutral) neutron is 
similar to a rotation in mathematics. 
This discovery was a brilliant page in 
the history of physics. Actually, it was 
from this moment that groups and non- 
commutative operations and the theory 


of symmetry, regarded broadly, entered 
theoretical physics. And as physics 
brought to light more and ever newer 
particles, the idea of symmetry in the 
properties of particles and the applica- 
tion of the theory of groups became in- 
creasingly significant. The theory grad- 
ually became more complex as rota- 
tions were examined in an imaginary 
multidimensional space and account was 
taken of particle charge and spin. Today 
the approach to physical objects from 
the standpoint of their inherent sym- 
metry is perhaps the prime method that 
physicists use as they try to analyze the 
great variety of elementary particles 
brought to light in the numerous exper- 
iments and theoretical works of scien- 
tists in the second half of the 20th cen- 
tury. 

Modern physics makes extensive use
not only of the theory of discrete (cry-
stallographic) groups, which, in a
way, copies elementary arithmetic (non-
commutative arithmetic!) but also the
theory of continuous groups, whose ori-
gins lie in algebra and mathematical
analysis. Along with the question of the
group of operations that transform the
square in Figure C.1a (or a crystal)
into itself, one can also pose the ques-
tion of the symmetry group for the entire
plane or for three- and multidimension-
al spaces. For example, the symmetry
group of a Euclidean plane consists of
all its motions

$$x' = x\cos\varphi + y\sin\varphi + a,$$

$$y' = -x\sin\varphi + y\cos\varphi + b, \tag{C.1}$$

where angle $\varphi$ and the numbers $a$ and
$b$ are arbitrary (see formulas (1.9.6)).
The fact that the angle $\varphi$ of rotation
and the quantities $a$ and $b$ characteriz-
ing the translation of a figure in the
plane may be arbitrary permits us to
speak of an infinitely small rotation
(through an infinitely small angle $\Delta\varphi$ or
$d\varphi$, with $d\varphi \to 0$) or an infinitely small
translation. This in turn makes it pos-
sible to introduce the concept of a
“group trajectory” (defined, say, by the
fact that $a$ and $b$ are constant and $\varphi$ va-
ries at our choice) and also the concepts
of the differential, derivative, and inte-
gral in the group space. Here we can
also define the main functions, such
as, for instance, the exponential $e^x$ (giv-
en, say, by formula (6.2.2), with $x$
the “group variable”). The noncommu-
tativity of the quantities that we meet
in this theory creates absolutely new 
problems, which have no analogies ei- 
ther in the theory of functions of a real 
variable (or several real variables) or 
in the theory of complex variables. 
Continuous groups, like the group (C.l) 
of motions of a plane, were first stud- 
ied in detail by the Norwegian mathe- 
matician Sophus Lie (1842-1899) and 
are now called Lie groups. During the
past two decades, Lie group theory has 
acquired enormous significance for the 
classification of elementary particles in 
modern physics and for an understand- 
ing of particle interactions. 
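
That the motions of the plane form a group, and a noncommutative one, can be sketched numerically. The code below assumes the sign convention $x' = x\cos\varphi + y\sin\varphi + a$, $y' = -x\sin\varphi + y\cos\varphi + b$ for a motion with parameters $(\varphi, a, b)$. The helper names and the sample parameter values are hypothetical.

```python
import math

def motion(phi, a, b):
    """Return the motion with parameters (phi, a, b) as a function on points."""
    c, s = math.cos(phi), math.sin(phi)
    def apply(point):
        x, y = point
        return (x * c + y * s + a, -x * s + y * c + b)
    return apply

def compose_params(m1, m2):
    """Parameters of the single motion equal to 'apply m1, then m2'."""
    phi1, a1, b1 = m1
    phi2, a2, b2 = m2
    c, s = math.cos(phi2), math.sin(phi2)
    # The angles add; the translation of m1 is rotated by m2 and then shifted.
    return (phi1 + phi2, a1 * c + b1 * s + a2, -a1 * s + b1 * c + b2)

if __name__ == "__main__":
    m1 = (0.3, 1.0, -2.0)      # sample values, chosen arbitrarily
    m2 = (1.1, 0.5, 0.25)
    p = (0.7, -1.3)
    print(motion(*m2)(motion(*m1)(p)))
    print(motion(*compose_params(m1, m2))(p))   # same point, up to rounding

    rotation = (math.pi / 2, 0.0, 0.0)
    translation = (0.0, 1.0, 0.0)
    print(compose_params(rotation, translation))
    print(compose_params(translation, rotation))  # different translation part
```

The first pair of printed points shows closure (two motions in succession are again a motion of the same form); the second pair shows that a rotation and a translation do not commute.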

The outstanding importance of the theory 
of groups and, in particular, Lie groups, for 
modern physics prompts us to dwell briefly 
on the history of the origin of these depart-
ments of mathematics. C.16 We have already
spoken of the pioneering work done by the 
brilliant young Frenchman Evariste Galois 
in the theory of equations (his predecessors 
were the famous Lagrange, the Italian Paolo 
Ruffini (1765-1822), and the distinguished 
Norwegian mathematician Niels Henrik Abel 
(1802-1829)). C.17 Galois’s work was not ap-
preciated during his lifetime; what is more, it
even failed to gain notice. Galois wrote letters 
to the famous French mathematicians Cauchy 
and Simeon-Denis Poisson (1781-1840). Pois- 
son was unable to grasp the importance of 
Galois’s ideas, which were far ahead of his 
time, while Cauchy evidently did not even read 
the letter sent to him by an unknown youth. 


C.16 For more details see I. M. Yaglom,
Felix Klein and Sophus Lie , Znanie, Moscow, 
1977 (an enlarged English version of this book 
will be published by Birkhauser-Verlag, 
Boston, New York). 

C.17 The difference between the work done
by Galois and that of his predecessors con-
sisted not only, and not so much, in that
Galois was the first to introduce the term “group”
and to define this concept rigorously, as in that
Lagrange, Ruffini, and Abel worked exclusive-
ly with commutative (or Abel) groups, where
$ab = ba$ for any two elements $a$ and $b$, whereas
Galois introduced general-type (noncommuta- 
tive) groups. 


Cauchy died in 1857, and in the 1860s
the French Academy of Sciences decided to
publish his complete works. The prominent
mathematician Camille Jordan (1838-1922)
was put in charge of the project, and in con-
nection with this he made a close study of all
of Cauchy’s personal papers. Among them he
discovered the letter from Galois, which,
30 years after it had been written, made a tre-
mendous impression on him. Jordan painstak-
ingly studied all the published and unpub-
lished writings of Galois. He gradually arrived
at the conviction that he must write a book
about his young compatriot’s ideas. Jordan’s
monograph, entitled Traité des substitutions et
des équations algébriques (A Treatise on Sub-
stitutions and Algebraic Equations), which
was published in 1870, was the first textbook
on the theory of groups to appear in the
mathematical literature.

In the period when Jordan was working
on his treatise there were among his students
two capable young friends, the mathematicians
Felix Klein of Germany and Sophus Lie of
Norway. They too took a keen interest in the
ideas of Galois and the theory of groups. Jor-
dan was the first to note the existence of two
different types of groups: discrete (crystallo-
graphic) and continuous. His two pupils
divided these two branches of mathematics
between themselves, so to say. Klein’s main
achievements were in the sphere of discrete
groups: a special type of discrete groups which
he distinguished, now known as Klein groups,
has been arousing great interest lately. Along
with that, in the dissertation which he sub-
mitted on the occasion of his admission to the
faculty of the University of Erlangen and
which later came to be called the Erlangen
Programm, he suggested a system of symmetry
(the symmetry group) of any geometric mani-
fold as a principle to be used in classifying the
separate branches of mathematics. According
to Klein, the Lobachevsky space differs from
the conventional Euclidean space not only
because its parallel lines have different prop-
erties (a secondary criterion which does not
explain much) but also because it has a totally
different symmetry group. Later this prin-
ciple was applied in physics, and today we are
inclined to believe that, for instance, Ein-
stein’s special theory of relativity differs
from Newton’s classical mechanics in that
the former has a different symmetry-group
structure in four-dimensional space-time; more-
over, in “Einstein’s world,” this group is
arranged so that the difference between the
spatial and temporal coordinates (see Sec-
tion 9.8 and the book mentioned in footnote
9.12) is somewhat diminished. Klein’s prin-
ciple of symmetry is of great importance to the
whole of modern physics.

Sophus Lie was that rare type of scholar 
who devoted his entire life and all of his pro-
digious research and writings (six very large
books and a multitude of articles which were
later collected in his multivolume Collected
Works) to the theory of continuous groups
(Lie groups). He worked out a wide-ranging
theory of continuous groups. He transferred 
to differential equations the results that 
Galois had obtained for algebraic equations: the 
main difference here was that the symmetry 
group of an algebraic equation (the Galois 
group) is finite, whereas the symmetry group 
of a differential equation (the Lie group) is 
continuous. Lie’s theory of the groups of differ- 
ential equations (in particular, the classi- 
fication of differential equations via their sym- 
metry group) has attracted much attention of 
mathematicians and physicists in recent de- 
cades. 

Lie did not live to the 20th century and,
unfortunately, could not see the rapid flow-
ering of his theory; for one thing, he did not
know that the theory of continuous groups
would be applied in physics. The group-theory
period in the history of theoretical physics
began with in-depth investigations by Nobel
Prize winners Max Born and his pupil W. Hei-
senberg, both of Germany, and P. A. M. Dirac
of Britain. Hermann Weyl, C.18 a pupil of
D. Hilbert C.19 (already mentioned on many
occasions), both of them among the most distinguished
mathematicians of the 20th century, and
the physicist E. Wigner, were also among the
trailblazers of that period.

The theory of symmetry and the theo- 
ry of groups play a very important 
role in the most fundamental and dif- 
ficult part of physics, the theory of 
elementary particles.

For a long time this theory developed 
in one direction: physicists “split” com- 
plicated objects and discovered the sim- 
pler parts that comprised them. It thus 
appeared that molecules consist of 
atoms, atoms consist of nuclei and elec- 
trons, and nuclei consist of protons 
and neutrons. Today we would add 
that protons and neutrons consist of 
quarks. The evolution of physics was 
reminiscent of the way a child takes 
apart a nest of Russian painted wooden 
dolls and discovers a smaller doll in- 
side each one it opens up. In each new 
layer, physicists saw a simpler picture: 
all the countless types of molecules 


C.18 See M. H. A. Newman, “Hermann
Weyl...,” in Biographical Memoirs of Fellows 
of the Royal Society , 1957. 

C.19 See the informative book by C. Reid,
Hilbert , Springer, Berlin, 1970. 


proved to be combinations of approxi- 
mately 100 types of atoms having ap- 
proximately 1000 different nuclei. 
However, all these many nuclei were re- 
duced to only two types of elementary 
particles, that is, protons and neu- 
trons. But by about the middle of the 
20th century an opposite trend appeared 
in theoretical physics — one toward an 
ever more complicated picture of the 
Universe. A very large number of ele- 
mentary particles, more than 100, have 
been discovered in cosmic rays and 
also with the aid of accelerators. All 
these particles are just as “elementary” 
as the proton or neutron. Here the nest- 
of-dolls model came into operation 
again: many particles are now describ- 
ed as consisting of simpler particles, 
quarks, which have not yet been ob- 
served. At present, however, we are not 
only, and not so much, preoccupied 
with the question of whether “still 
more elementary particles” exist. The 
establishment of symmetry— that is, a 
resemblance between various particles — 
leads to extremely important conclu- 
sions about their properties. The inter- 
action of particles with one another, 
the possibility of the transformation of 
certain particles into other particles, 
and other properties are very closely 
connected with symmetry. The secrets 
of quarks, electrons, and neutrinos can
perhaps be revealed by a deeper study 
of their symmetry. And today the the- 
ory of symmetry plays a cardinal role 
in all of the investigations that are 
being conducted. And so physicists na- 
turally take a great interest in all 
aspects of the theory of symmetry and, 
first and foremost, the theory of groups 
(discrete and continuous). 

Not long ago, in a serious article, we 
came across the following: “Since the 
theory of groups is now taught in 
high school, persons who recently 
finished school can skip the next sec- 
tion. However, older physicists should 
give it their attention in order to be 
able to discuss it with their children 
and to teach their students.” We have
not yet decided to include elements of
the theory of groups in our mathemat-
ics textbook for students who are be-
ginners in physics, but we may yet have
to do so. 

An absolutely new situation has 
arisen in mathematics and mathematical 
physics with the appearance of com- 
puters. They range from pocket calcu- 
lators to computer complexes, each unit 
of which handles a separate stage of the 
operations. The functioning of these 
complexes can be compared in a way 
to the higher nervous activity of man, 
in which the left hemisphere of the 
brain (which controls speech) and the 
right hemisphere (which handles visual 
impressions) perform different functions. 
Computers have significantly changed 
the approach to problems in higher 
mathematics in that they often make it 
simpler to apply an approximate brute- 
force calculation than to use complicat- 
ed mathematical models. Furthermore, 
computers have given rise to absolutely 
new branches of mathematics. 

One occasionally hears the remark 
that “mathematics is a mill that grinds 
up only what is put into it.” In this 
way poor results are explained by the 
fact that the original premises were 
faulty. In reality, the mill quite often 
turns out much more than is put in 
and the results are sometimes totally 
unexpected! 

Fundamentally, mathematics can be 
regarded as a variety of refined logic. 
The remarkable thing is that having 
set up the rules of the logic and learned 


them, man has at his disposal a more 
powerful tool than ordinary “common 
sense,” which is based on traditional 
logic. 

Using his hands, man makes simple 
tools with the aid of which he makes 
machine tools; with the aid of these 
he constructs more complicated devi- 
ces, and using these more complicated 
devices he does things which he could 
not do with his hands alone. Mathemat- 
ics is very much like that. It devel- 
ops more and more complicated theo- 
ries, introduces fresh notions and en- 
ables us to comprehend and master the 
most unusual phenomena of nature. 

At the end of the course of mathema- 
tical physics we can again start a new 
chapter entitled “What Next?”, but do 
not lose heart, for not far off is the 
boundary line that marks the end of study 
and the beginning of creativity and 
the development of new theories. 

We have tried to picture the reader 
who in a few moments will close this 
book with a sigh of relief. Most likely,
you are finishing school or are in your 
first year at college. May mathematics 
always remain for you an exact and 
beautiful language, a means of express- 
ing ideas, a way of thought. May math- 
ematics be more than merely another 
subject that must be “passed” at an 
examination and then left behind with- 
out a trace. Love mathematics — 
and mathematics will love you and help 
you in your work. 
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Appendix 1. Derivatives 


1. $y = c$, $\frac{dy}{dx} = 0$. 

* 9 dx 
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y ’ dx 
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** ^- e ’ dx e • 

5. $y = a^x$, $\frac{dy}{dx} = a^x \ln a \approx 2.3\,a^x \log_{10} a$. 

9. $y = \cos x$, $\frac{dy}{dx} = -\sin x$. 

10. $y = \tan x$, $\frac{dy}{dx} = \frac{1}{\cos^2 x}$. 

11. $y = \cot x$, $\frac{dy}{dx} = -\frac{1}{\sin^2 x}$. 

12. $y = \arcsin x$, $\frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}$. 

13. $y = \arccos x$, $\frac{dy}{dx} = -\frac{1}{\sqrt{1 - x^2}}$. 

14. $y = \arctan x$, $\frac{dy}{dx} = \frac{1}{1 + x^2}$. 

15. $y = \operatorname{arccot} x$, $\frac{dy}{dx} = -\frac{1}{1 + x^2}$. 
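
The entries of this table can be spot-checked numerically. The sketch below is only an illustration, not part of the original appendix: it compares a few tabulated derivatives with a symmetric difference quotient. The helper names (`central_difference`, `TABLE`, `max_table_error`), the step `h`, and the sample point `x = 0.4` are assumptions chosen for the example.

```python
import math


def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (function, tabulated derivative) pairs for items 5, 9, 10, 11 and 14.
# The base a = 3 in the first entry is an arbitrary illustrative choice.
TABLE = {
    "a^x": (lambda x: 3.0 ** x, lambda x: 3.0 ** x * math.log(3.0)),
    "cos x": (math.cos, lambda x: -math.sin(x)),
    "tan x": (math.tan, lambda x: 1 / math.cos(x) ** 2),
    "cot x": (lambda x: 1 / math.tan(x), lambda x: -1 / math.sin(x) ** 2),
    "arctan x": (math.atan, lambda x: 1 / (1 + x * x)),
}


def max_table_error(x=0.4):
    """Largest gap between the numerical and the tabulated derivative at x."""
    return max(abs(central_difference(f, x) - df(x)) for f, df in TABLE.values())


if __name__ == "__main__":
    print(max_table_error())
```

A small value of `max_table_error()` only shows agreement at one point; it is a sanity check, not a proof of the formulas.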


Appendix 2. Integrals 
of Some Functions 


1. $\int dx = x + C$. 

2. $\int x^a\,dx = \frac{x^{a+1}}{a+1} + C$ ($a \neq -1$). 

3. $\int \frac{dx}{x} = \ln|x| + C$. 

4. $\int \frac{dx}{ax + b} = \frac{1}{a}\ln|ax + b| + C$. 

5. $\int a^x\,dx = \frac{a^x}{\ln a} + C$. 

6. $\int e^{kx}\,dx = \frac{1}{k}e^{kx} + C$. 

7. $\int x^n e^{kx}\,dx = \frac{1}{k}x^n e^{kx} - \frac{n}{k}\int x^{n-1}e^{kx}\,dx$. 

8. $\int \frac{dx}{x^2 + a^2} = \frac{1}{a}\arctan\frac{x}{a} + C$. 

9. $\int e^{kx}\sin ax\,dx = \frac{e^{kx}}{k^2 + a^2}(k\sin ax - a\cos ax) + C$. 

10. $\int e^{kx}\cos ax\,dx = \frac{e^{kx}}{k^2 + a^2}(k\cos ax + a\sin ax) + C$. 

11. $\int \sin kx\,dx = -\frac{1}{k}\cos kx + C$. 

12. $\int \cos kx\,dx = \frac{1}{k}\sin kx + C$. 

13. $\int \frac{dx}{\sin^2 kx} = -\frac{1}{k}\cot kx + C$. 

14. $\int \frac{dx}{\cos^2 kx} = \frac{1}{k}\tan kx + C$. 

15. $\int \sin^2 kx\,dx = \frac{x}{2} - \frac{1}{4k}\sin 2kx + C$. 

16. $\int \cos^2 kx\,dx = \frac{x}{2} + \frac{1}{4k}\sin 2kx + C$. 

17. $\int x^n \sin kx\,dx = -\frac{x^n}{k}\cos kx + \frac{n}{k}\int x^{n-1}\cos kx\,dx$. 

18. $\int x^n \cos kx\,dx = \frac{x^n}{k}\sin kx - \frac{n}{k}\int x^{n-1}\sin kx\,dx$. 

19. $\int \sin kx \sin lx\,dx = \frac{\sin(k - l)x}{2(k - l)} - \frac{\sin(k + l)x}{2(k + l)} + C$ if $|k| \neq |l|$ (if $|k| = |l|$, see No. 15). 

20. $\int \cos kx \cos lx\,dx = \frac{\sin(k - l)x}{2(k - l)} + \frac{\sin(k + l)x}{2(k + l)} + C$ if $|k| \neq |l|$ (if $|k| = |l|$, see No. 16). 

21. $\int \sin kx \cos lx\,dx = -\frac{\cos(k + l)x}{2(k + l)} - \frac{\cos(k - l)x}{2(k - l)} + C$ if $|k| \neq |l|$. 

22. $\int \tan kx\,dx = -\frac{1}{k}\ln|\cos kx| + C$. 
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' (n — l)(4ac — b 2 ) ) (ax 2 + bx + c) n ~ l 

for n^2 and 4ac—b 2 ^=0. 


39. $\int \frac{x\,dx}{ax^2 + bx + c} = \frac{1}{2a}\ln|ax^2 + bx + c| - \frac{b}{2a}\int \frac{dx}{ax^2 + bx + c}$ (see No. 37). 

40. $\int \frac{dx}{x(ax^2 + bx + c)} = \frac{1}{2c}\ln\frac{x^2}{|ax^2 + bx + c|} - \frac{b}{2c}\int \frac{dx}{ax^2 + bx + c}$ (see No. 37). 

41. $\int \frac{dx}{x^m(ax^2 + bx + c)^n} = -\frac{1}{(m - 1)c\,x^{m-1}(ax^2 + bx + c)^{n-1}} - \frac{(2n + m - 3)a}{(m - 1)c}\int \frac{dx}{x^{m-2}(ax^2 + bx + c)^n} - \frac{(n + m - 2)b}{(m - 1)c}\int \frac{dx}{x^{m-1}(ax^2 + bx + c)^n}$ 

(for $m > 1$). 


42. $\int \sqrt[n]{ax + b}\,dx = \frac{n(ax + b)}{a(n + 1)}\sqrt[n]{ax + b} + C$. 

43. $\int \frac{dx}{\sqrt[n]{ax + b}} = \frac{n(ax + b)}{a(n - 1)}\cdot\frac{1}{\sqrt[n]{ax + b}} + C$. 

44. $\int \ln x\,dx = x\ln x - x + C$. 


dx 1 

— — n" = — ar c tan \-C. 

x 2 -j-a 2 a " 1 
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45. j (In x) n dx 

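$= x(\ln x)^n - n\int (\ln x)^{n-1}\,dx$. 

46. $\int \arcsin\frac{x}{a}\,dx = x\arcsin\frac{x}{a} + \sqrt{a^2 - x^2} + C$. 

47. $\int \arccos\frac{x}{a}\,dx = x\arccos\frac{x}{a} - \sqrt{a^2 - x^2} + C$. 

48. $\int \arctan\frac{x}{a}\,dx = x\arctan\frac{x}{a} - \frac{a}{2}\ln(a^2 + x^2) + C$. 

49. $\int \operatorname{arccot}\frac{x}{a}\,dx = x\operatorname{arccot}\frac{x}{a} + \frac{a}{2}\ln(a^2 + x^2) + C$. 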
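
Any entry of the table can be checked by comparing the difference of the tabulated antiderivative at two points with a numerical quadrature of the integrand. The sketch below does this for Nos. 46 and 48; it is an illustration only, and the names `simpson`, `antiderivative_46`, `antiderivative_48`, the value `a = 2` and the interval [0, 1] are assumptions made for the example.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def antiderivative_46(x, a):
    """Right-hand side of No. 46 without the constant C."""
    return x * math.asin(x / a) + math.sqrt(a * a - x * x)


def antiderivative_48(x, a):
    """Right-hand side of No. 48 without the constant C."""
    return x * math.atan(x / a) - a / 2 * math.log(a * a + x * x)


def check(a=2.0, lo=0.0, hi=1.0):
    """Return the two discrepancies |quadrature - (F(hi) - F(lo))|."""
    err_46 = abs(simpson(lambda x: math.asin(x / a), lo, hi)
                 - (antiderivative_46(hi, a) - antiderivative_46(lo, a)))
    err_48 = abs(simpson(lambda x: math.atan(x / a), lo, hi)
                 - (antiderivative_48(hi, a) - antiderivative_48(lo, a)))
    return err_46, err_48


if __name__ == "__main__":
    print(check())
```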

Appendix 3. Series Expansions 

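1. $(1 + x)^m = 1 + mx + \frac{m(m - 1)}{2!}x^2 + \frac{m(m - 1)(m - 2)}{3!}x^3 + \dots$ 
($-1 < x < 1$ if $m$ is not a positive integer). 

2. $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$ (any $x$). 

3. $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots$ (any $x$). 

4. $\tan x = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \frac{62}{2835}x^9 + \dots$ ($-\pi/2 < x < \pi/2$). 

5. $e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dots$ (any $x$). 

6. $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots$ ($-1 < x < 1$). 

7. $\arcsin x = x + \frac{1}{2}\cdot\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{x^5}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\cdot\frac{x^7}{7} + \dots$ ($-1 < x < 1$). 

8. $\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots$ ($-1 < x < 1$). 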
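
To see how quickly these expansions converge, one can sum a finite number of terms and compare with a library value. The following sketch does so for four of the series; the function names and the default numbers of terms are assumptions chosen for illustration, not anything prescribed by the text.

```python
import math


def sin_series(x, terms=10):
    """Partial sum of x - x^3/3! + x^5/5! - ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))


def exp_series(x, terms=20):
    """Partial sum of 1 + x/1! + x^2/2! + ..."""
    return sum(x ** k / math.factorial(k) for k in range(terms))


def log1p_series(x, terms=60):
    """Partial sum of x - x^2/2 + x^3/3 - ... (intended for |x| < 1)."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))


def arctan_series(x, terms=60):
    """Partial sum of x - x^3/3 + x^5/5 - ... (intended for |x| < 1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))


if __name__ == "__main__":
    print(sin_series(1.0) - math.sin(1.0))
    print(exp_series(2.0) - math.exp(2.0))
    print(log1p_series(0.5) - math.log(1.5))
    print(arctan_series(0.5) - math.atan(0.5))
```

The series for sin x and e^x converge for any x, whereas the logarithm and arctangent series slow down markedly as |x| approaches 1, which is why more terms are used for them here.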


Appendix 4. Numerical Tables 


TABLE A4.1 


  x      e^x     e^-x   |   x      e^x      e^-x    |   x       e^x       e^-x 

  0      1.000   1.000  |   1.5    4.482    0.223   |   3.8     44.701    0.0224 
  0.1    1.105   0.905  |   1.6    4.953    0.202   |   4.0     54.598    0.0183 
  0.2    1.221   0.819  |   1.7    5.474    0.183   |   4.5     90.017    0.0111 
  0.3    1.350   0.741  |   1.8    6.050    0.165   |   5.0     148.41    0.00674 
  0.4    1.492   0.670  |   1.9    6.686    0.150   |   5.5     244.69    0.00409 
  0.5    1.649   0.607  |   2.0    7.389    0.135   |   6.0     403.43    0.00248 
  0.6    1.822   0.549  |   2.2    9.025    0.1108  |   6.5     665.14    0.00150 
  0.7    2.014   0.497  |   2.4    11.023   0.0907  |   7.0     1096.6    0.000912 
  0.8    2.226   0.449  |   2.6    13.464   0.0743  |   7.5     1808.0    0.000553 
  0.9    2.460   0.407  |   2.8    16.445   0.0608  |   8.0     2981.0    0.000335 
  1.0    2.718   0.368  |   3.0    20.086   0.0498  |   8.5     4914.8    0.000203 
  1.1    3.004   0.333  |   3.2    24.533   0.0408  |   9.0     8103.1    0.000123 
  1.2    3.320   0.301  |   3.4    29.964   0.0334  |   9.5     13 360    0.000075 
  1.3    3.669   0.273  |   3.6    36.598   0.0273  |   10.0    22 026    0.000045 
  1.4    4.055   0.247  | 
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Appendix 5. The International System, or 

SI 


Quantity (name)                            Dimensions         Name of SI unit                 Unit symbol 

Base units 

length                                     m                  meter                           m 
mass                                       kg                 kilogram                        kg 
time                                       s                  second                          s 
electric current                           A                  ampere                          A 
temperature                                K                  kelvin                          K 
luminous intensity                         cd                 candela                         cd 
amount of substance                        mol                mole                            mol 

Additional units 

plane angle                                -                  radian                          rad 
solid angle                                -                  steradian                       sr, sterad 

Derived units 

velocity                                   m/s                meter per second                m/s 
acceleration                               m/s²               meter per second squared        m/s² 
angular velocity                           1/s                radian per second               rad/s 
angular acceleration                       1/s²               radian per second squared       rad/s² 
force                                      kg·m/s²            newton                          N 
pressure                                   kg/(m·s²)          pascal                          Pa 
work, torque, energy, quantity of heat     kg·m²/s²           joule                           J 
power, heat flux                           kg·m²/s³           watt                            W 
heat capacity (specific)                   m²/(s²·K)          joule per kilogram kelvin       J/(kg·K) 
momentum                                   kg·m/s             kilogram-meter per second       kg·m/s 
impulse                                    kg·m/s             newton-second                   N·s 
frequency                                  1/s                hertz                           Hz 
quantity of electricity                    A·s                coulomb                         C 
electromotive force, potential difference  kg·m²/(A·s³)       volt                            V 
electric field strength                    kg·m/(A·s³)        volt per meter                  V/m 
electric resistance                        kg·m²/(A²·s³)      ohm                             Ω 
electric capacitance                       A²·s⁴/(kg·m²)      farad                           F 
inductance                                 kg·m²/(A²·s²)      henry, weber per ampere         H, Wb/A 
magnetic flux                              kg·m²/(A·s²)       weber                           Wb 
magnetic field strength                    A/m                ampere per meter                A/m 
magnetic flux density, magnetic induction  kg/(A·s²)          tesla, weber per square meter   T, Wb/m² 
magnetomotive force                        A                  ampere                          A 


Appendix 6. Greek Alphabet 


Α α   Alpha        Ι ι   Iota         Ρ ρ   Rho 
Β β   Beta         Κ κ   Kappa        Σ σ   Sigma 
Γ γ   Gamma        Λ λ   Lambda       Τ τ   Tau 
Δ δ   Delta        Μ μ   Mu           Υ υ   Upsilon 
Ε ε   Epsilon      Ν ν   Nu           Φ φ   Phi 
Ζ ζ   Zeta         Ξ ξ   Xi           Χ χ   Chi 
Η η   Eta          Ο ο   Omicron      Ψ ψ   Psi 
Θ θ   Theta        Π π   Pi           Ω ω   Omega 






Hints, Answers, and Solutions 


Part 1 
Chapter 1 

1.2.4. $r = \sqrt{2}$, $\alpha = 45^\circ$; $r = 2\sqrt{2}$, $\alpha = -45^\circ$; 
$r = 3\sqrt{2}$, $\alpha = -135^\circ$; $r = 4\sqrt{2}$, $\alpha =$ 
135°._1.2.5. 2±2V 2; 2y r 2; lY 2. 1.2.6^ (0, 
$a/\sqrt{2}$); ($a/\sqrt{2}$, 0); (0, $-a/\sqrt{2}$); ($-a/\sqrt{2}$, 0) 
(make a drawing). 1.2.7. ($a$, 0); ($a/2$, $a\sqrt{3}/2$); 
($-a/2$, $a\sqrt{3}/2$); ($-a$, 0); ($-a/2$, $-a\sqrt{3}/2$); 
($a/2$, $-a\sqrt{3}/2$) (make a drawing). 1.2.8. (a) 
Two cases: ($-a/2$, 0), ($a/2$, 0), (0, $\pm a\sqrt{3}/2$). 
(b) Four cases: (0, 0), ($a$, 0), ($a/2$, $\pm a\sqrt{3}/2$) 
and (0, 0), ($-a$, 0), ($-a/2$, $\pm a\sqrt{3}/2$) (make a 
drawing). 1.2.9. $A_2(x_1, -y_1)$, $A_3(-x_1, y_1)$, 
$A_4(-x_1, -y_1)$ (make a drawing). 

1.3.3. (a) y = — x\ (b) y = 2x — 1. 1.3.4. If 
the straight line intersects the coordinate 
axes at the points M and N , then a and b taken 
with the appropriate signs are the lengths of 
the line segments OM and ON. 

1.4.2. Use the fact that $x^2 - 2x + 2 = (x - 1)^2 + 1$; 
$2x^2 + 4x = 2(x + 1)^2 - 2$. 

1.4.3. The stretching from the origin (homothety) 
with a coefficient $\sqrt{k}$ transforms point 
$M(x, y)$ to point $M'(x', y')$, with $x' = \sqrt{k}\,x$, 
$y' = \sqrt{k}\,y$. Whence, $x'y' = kxy = k$. (b) The 
stretching along the $x$ axis with the coefficient $k$ 
transforms point $M(x, y)$ to point $M_1(x_1, y_1)$, 
with $x_1 = kx$, $y_1 = y$ (whence $x = \frac{1}{k}x_1$, 
$y = y_1$). 1.4.4. We must prove that for $k, x_1, x_2 > 0$ 
the following inequality holds true: 

$\frac{1}{2}\left(\frac{k}{x_1} + \frac{k}{x_2}\right) > \frac{k}{(x_1 + x_2)/2}$, or $\frac{x_1 + x_2}{2x_1x_2} > \frac{2}{x_1 + x_2}$. 

1.5.1. (c) The graph of the function $y = x^4 + x^2$ 
can be obtained by the “addition” of 
the already familiar graphs of the functions 
$y_1 = x^4$ and $y_2 = x^2$; in shape it resembles these 
graphs. 1.5.3. See Figure H.1a-d, corresponding 
to the limiting cases $n \to \infty$; for large $n$ 
the graphs are close to those given in Figure H.1. 


1.6.1. (a) y~~^ 2; (b) y = ± Yx + 2; 

(c) *={'£+! -1; (d) y = ±V lAJ+l-l. 

1.6.3. (a) y = (b) y = 

/ a: — (c — 6 2 /4a) 6 , x 

a (C) " = 

da: — b 
— cz-j-a 

1.7.8. Here we can easily prove in a purely 
formal manner that $x$ and $y$ appearing in the 
conditions for the substitution of the variable 
preserve the sign of $aB$ (see the solution to 
Exercise 1.7.9), from which it immediately 
follows that the curves (1.7.10a), (1.7.10b), 
and (1.7.10c) cannot approach one another. A 
more interesting (“geometrical”) reasoning 
is also possible here. It is easy to see that neither 
the translation of the coordinate origin nor 
the change in scales along the $x$ and $y$ axes 
(which is equivalent to uniform stretching 
of the curve along the axes) affects the presence 
or absence of (local) maxima or minima of the 
curve $y = f(x)$, or the presence or absence 
of salient points at which the tangent is 
horizontal. From this and from the fact that 
the curve (1.7.10c) has two maxima (the 
monotonically increasing curves (1.7.10a) and 
(1.7.10b) have no maxima) and the curve 
(1.7.10a) has a salient point at which the 
tangent is horizontal, namely the origin, the 
required assertion follows immediately. 1.7.9. At 
$B = c - \frac{b^2}{3a} = 0$ the relation (1.7.9) 
reduces to the form (1.7.10a); if $B \neq 0$, this 
relation reduces to the form (1.7.10b) for 
$aB = ac - \frac{b^2}{3} > 0$, and to the form (1.7.10c) 
for $ac - \frac{b^2}{3} < 0$. 

1.8.1. (a) The curve is represented in Fig- 
ure H.2. 1.8.2. The circle x 2 + y 2 = 1. 

1.8.3. An ellipse (Figure H.3). 1.8.4. The line 
segment of the straight line y = x between the 
points (—1, —1) and (1, 1). 1.8.5. x = 

r arc cos ~V2n / — i/ 2 . 1.8.6. (a) The 



Figure H.1 
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Figure H.2 



Figure H.3 


equations of the epicycloid: $x = (R + r)\cos t - r\cos\frac{R + r}{r}t$, 
$y = (R + r)\sin t - r\sin\frac{R + r}{r}t$. 

(b) The equations of the hypocycloid: 
$x = (R - r)\cos t + r\cos\frac{R - r}{r}t$, 
$y = (R - r)\sin t - r\sin\frac{R - r}{r}t$. 
Figure H.4a-d shows respectively: the epicycloid 
for $R = r$ (cardioid); the hypocycloid for 
$r = (1/3)R$ (the Steiner curve); the hypocycloid 
for $r = (1/4)R$ (astroid); the hypocycloid for 
$r = (1/2)R$ (the diameter of a circle of radius $R$). 

1.9.1. (a) y = — x; (b) x = 0; (c) y = x + 
2; (d) y = (—4/3) x. 1.9.2. For the first triple 
of points: AA 2 A t A s = 90°; SaajAzAs = 25; 

= 2/ 5. 1.9.3. (a) 2/ 2. 1.9.4. (a) 
$3\sqrt{2}/2$. 1.9.7. If $y = kx + s$ is the equation 
of the straight line, then in the new coordinates 
$x'$, $y'$ related to the old coordinates $x$, 
$y$ by formulas (1.9.6) the equation of the same 
straight line will have the form $y' = k'x' + s'$, 
where 

$k' = \frac{k\cos\alpha' + \sin\alpha'}{\cos\alpha' - k\sin\alpha'}$, $s' = \frac{s + ka' - b'}{\cos\alpha' - k\sin\alpha'}$; 

here $\alpha' = -\alpha = \angle x'Ox$ is the angle between 
the $O'x'$ and $Ox$ axes, and $a'$ and $b'$ are the 
coordinates of the new origin $O'$ in the old 
system $xOy$. 


Chapter 2 

2.1.1. (a) $v_{av} = 3t^2 + 1 + 3t\,\Delta t + (\Delta t)^2$; 
$v_{inst} = 3t^2 + 1$. 2.1.2. $v_{av} = v_{inst}$. 

2.2.1. $c_{av} = 0.3965 + 4.162 \times 10^{-3}\,T - 15.072 \times 10^{-7}\,T^2 + (2.081 \times 10^{-3} - 15.072 \times 10^{-7}\,T)\,\Delta T - 5.024 \times 10^{-7}\,(\Delta T)^2$; 
$c_{inst} = 0.3965 + 4.162 \times 10^{-3}\,T - 15.072 \times 10^{-7}\,T^2$; 
$c(0) = 0.3965$; $c(100) = 0.7977$; $c(500) = 2.1007$. 
2.2.2. $c_{inst} = 4186.68 + 16746.72 \times 10^{-5}\,T + 3768 \times 10^{-6}\,T^2$. 
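
The relation between the average and the instantaneous heat capacity in 2.2.1 can be illustrated numerically. In the sketch below the function `heat` is an assumption: it is the cubic polynomial whose derivative is the stated instantaneous capacity (the exercise statement itself is not reproduced in this section). The names `c_inst`, `c_av` and `c_av_formula` are likewise chosen for the example.

```python
def heat(T):
    """Assumed heat function Q(T); its derivative is c_inst from 2.2.1."""
    return 0.3965 * T + 2.081e-3 * T ** 2 - 5.024e-7 * T ** 3


def c_inst(T):
    """Instantaneous heat capacity as given in the answer to 2.2.1."""
    return 0.3965 + 4.162e-3 * T - 15.072e-7 * T ** 2


def c_av(T, dT):
    """Average heat capacity over [T, T + dT] as a difference quotient."""
    return (heat(T + dT) - heat(T)) / dT


def c_av_formula(T, dT):
    """Closed form of the average capacity quoted in the answer to 2.2.1."""
    return (c_inst(T)
            + (2.081e-3 - 15.072e-7 * T) * dT
            - 5.024e-7 * dT ** 2)


if __name__ == "__main__":
    for T in (0, 100, 500):
        print(T, round(c_inst(T), 4))
    print(c_av(100, 50), c_av_formula(100, 50))
```

As the interval `dT` shrinks, `c_av` approaches `c_inst`, which is the point of the exercise.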

2.4.1. (a) $y' = 4x^3$; (b) $y' = 12x^2 - 6x + 2$; 
(c) $y' = 8x + 4$; (d) $y' = -2/x^3$; (e) 
$y' = a(1 - 1/x^2)$; (f) $y' = 2ax - 2b/x^3$. 2.4.4. 
$(1.2)^2 = 1.44$; if $y(x) = x^2$, $x = 1$, $\Delta x = 0.2$, 
then $y(x) + y'(x)\,\Delta x = 1^2 + 2 \times 0.2 = 1.4$; 
the error is 3%. Similarly, $(1.1)^2 = 1.21$ and 
$(1.1)^2 \approx 1^2 + 2 \times 0.1 = 1.2$ (the error is 1%) 
and so on. 
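
The arithmetic of 2.4.4 is easy to reproduce. The sketch below evaluates the linear estimate y(x) + y'(x)Δx and its relative error for y = x²; the function names are assumptions chosen for the example.

```python
def linear_estimate(f, df, x, dx):
    """First-order estimate f(x) + f'(x) * dx of the value f(x + dx)."""
    return f(x) + df(x) * dx


def relative_error_percent(exact, approx):
    """Relative error of approx with respect to exact, in percent."""
    return abs(exact - approx) / abs(exact) * 100


def square(x):
    return x * x


def square_derivative(x):
    return 2 * x


if __name__ == "__main__":
    for dx in (0.2, 0.1, 0.05):
        exact = square(1 + dx)
        approx = linear_estimate(square, square_derivative, 1.0, dx)
        print(dx, exact, approx, relative_error_percent(exact, approx))
```

The error shrinks roughly in proportion to (Δx)², which is why halving Δx from 0.2 to 0.1 reduces it from about 3% to about 1%.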

2.5.2. The tangent is horizontal at the 
points $x = 0$ and $x = 2$. 2.5.3. See Figure H.5; 
the tangent is horizontal at $x = \pm 1/\sqrt{3}$. 
2.5.4. See Figure H.6. 2.5.5. See Figure H.7. 

2.5.6. (a) $y = (3/4)x - 1/4$, (1/3, 0), (0, 
$-1/4$); (b) $y = 3x - 2$, (2/3, 0), (0, $-2$). 

2.5.7. The tangent to the parabola (a) at the 
point $(x_0, y_0)$ intersects the coordinate axes 
at the points $(x_0/2, 0)$ and $(0, -y_0)$; the tangent 
to the cubical parabola (b) intersects the 
coordinate axes at the points $((2/3)x_0, 0)$, 
$(0, -2y_0)$. 

2.6.1. (a) $x = 0$; the minimum at $a > 0$, 
the maximum at $a < 0$; (b) $x = -1$, the maximum; 
$x = 1$, the minimum; (c) $x = -\sqrt{b/a}$ 
($b/a > 0$), the maximum at $a > 0$, the minimum 
at $a < 0$; $x = \sqrt{b/a}$ ($b/a > 0$), the 
maximum at $a < 0$, the minimum at $a > 0$. 
There is no maximum or minimum at $b/a < 0$; 






(b) 



(c) 



(d) 


Figure H.4 
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Figure H.5 



Figure H.6 



Figure H.7 


(d) $x = -1$, the maximum; $x = 1$, the minimum; 
(e) $a > 0$, $x = 0$, the minimum; $a = 0$, 
$x = 0$, the minimum; $a < 0$, $x = 0$, the maximum, 
$x = -\sqrt{-a/2}$, the minimum, $x = \sqrt{-a/2}$, 
the minimum. 2.6.2. $t = 4^\circ$. 

2.6.3. (a) $t \approx 4.08^\circ$, (b) $t \approx 3.92^\circ$. 

2.7.1. 2; $6x$; $12x^2$; $2a$. 2.7.2. The acceleration 
equals $2a$, it is constant. 2.7.3. (a), (c) The 
curve is convex at $x < 0$, concave at $x > 0$; 
(b) convex at $x < -p/3$, concave at $x > -p/3$. 


Chapter 3 

3.2.2. Below are given the values of the 
sums when the interval is partitioned into $m$ 
parts ($m$ = 10, 20, 50, ∞): 

m            10      20      50      ∞ 
Δt           0.1     0.05    0.02    0 
lower sum    2.18    2.26    2.30    2.33 
upper sum    2.49    2.41    2.37    2.33 

As seen, even at $m = 50$ both sums differ but 
little from the limiting value at $m = \infty$. 
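
Sums of this kind can be recomputed directly. The sketch below assumes, judging by the limiting value 2.33 = 7/3, that the integral in question is that of t² from 1 to 2; this interval and the names `lower_upper_sums` and `square` are assumptions made for the example, since the exercise statement is not reproduced here.

```python
def lower_upper_sums(f, a, b, m):
    """Left- and right-endpoint sums of f on [a, b] with m equal parts.

    For an increasing f the left sum is the lower sum and the right sum
    is the upper sum.
    """
    dt = (b - a) / m
    left = sum(f(a + i * dt) for i in range(m)) * dt
    right = sum(f(a + (i + 1) * dt) for i in range(m)) * dt
    return left, right


def square(t):
    return t * t


if __name__ == "__main__":
    for m in (10, 20, 50, 1000):
        lo, hi = lower_upper_sums(square, 1.0, 2.0, m)
        print(m, lo, hi)
```

Both sums squeeze the exact value 7/3 from either side, and their difference equals (f(b) - f(a))·Δt, so it shrinks in proportion to 1/m.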


For an exact evaluation of the integral 
(the limiting sum corresponding to the value 
m — oo) we write 


_ 2m 2 + 3m+i 1.11.11 

6m 2 3 ' 2 m 6 m 2 ' 


2 

whence it follows immediately that ^t 2 dt = 

l 

-|-(~0.33). 3.2.3. Let q = ™' r 2; we denote 

o 

$x_0 = 1$, $x_1 = q$, $x_2 = q^2$, ..., $x_m = q^m = 2$. Further, 

$\sum_l x_l^3\,\Delta x_l = \sum_l (x_l)^3 (x_l - x_{l-1}) = \sum_l (q^l)^3 (q^l - q^{l-1}) = (q - 1)\sum_{l=1}^{m} q^{4l-1} = (q - 1)(q^3 + q^7 + q^{11} + \dots + q^{4m-1}) = (q - 1)\,\frac{q^3[(q^m)^4 - 1]}{q^4 - 1} = \frac{q^3(2^4 - 1)}{q^3 + q^2 + q + 1}$; 

sending $m \to \infty$ (so that $q \to 1$), we get $\int_1^2 x^3\,dx = \frac{15}{4} = 3.75$. 


3.4.1. (a) 1/3; (b) (1.1*— 1) c* 0.11033; 

(c) 1/2; (d) 2 (/3— 1) c* 1.464. 3.4.2. (a) S= 
b > 

j y (x) dx, with y=y(x) = — x the equation 
o 

b 

j -> ^ 

— sda: = 

0 


u 


xdx 


JL — 

b 2 


y 


(b) 5 - 



2 

y 


3.4.3. (a) 5= 
3.4.4. 5 = 


r 



r 2 #2 


3.4.5. 0.7837 for n = 5; 


0.7850 for n = 10. 3.4.6. See Figure H.8. 

3.4.7. See Figure H.9a-c. 
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Figure H.8 



Y 


(b) (c) 


3.6.2. It does not change. 3.6.3. (2/3) R 3 X 

i 

tan a. 3.6.4. yY 1 — y 2 dy = 1/3; note that 

the area of the cross section of the “hoof” 
(Figure 3.6.8) with the plane, which is per- 
pendicular to the plane ABP , parallel to AB, 
and distant by y from AB , is equal to 

2 y V 1 - V 2 - 


Chapter 4 

4.2.1. (a) $y' = (x^2)'x^2 + x^2(x^2)' = 2 \times 2x \times x^2 = 4x^3$; 
(b) $y' = 5ax^4 + 4bx^3 + 3cx^2 +$ 



4.3.1. 


dz _ dz dy 
dx dy dx 
z' = ( ax 2 -f- 2abx + b 2 )' 

4.3.2. (a) z'=? — 

2a 


(aa: + 6) 3 

2a: (a: 2 + 4a: + 5) 

; 

fa 2 

4.3.3. y" = ±1- 


(ax + b ) 2 
(c) z' 1 


2ya = 2a (ao: -f- b); 
2 a 2 x -f- 2 ab. 

(b) z' = 


(X + 1)2 


(d) y' = 


x 2 — 2a: — 2 
(r 2 -4- W 
• 2/'gg' + 2/ (g')» - /gg" 


(e) </' 


4.4.1. (a) y' 1 


|A( 2 x + 1)2 ’ 


(b) y' = 


(x+1) Y Z 2 — 1 


1 1 
(c) 

4 zyz 


4 - 4 - 3 - W 5-= 

5 1 d 2 y _ & 1 

16 jij i ’ ° dx 2 ~ a 2 sin 3 * • 

4.5.1. (a) $y' = 5x^4 - 12x^3 + 3x^2 + 14x - 2$; 

(b) $y' = 2(x^3 + x + 1)(3x^2 + 1)$; (c) $y' = 4(x^2 - x + 1)^3(2x - 1)$; 
(d) $y' = 10(3x^2 - 1)^9 \cdot 6x$; 

(e) y' = 1 — - ; (f) y = x 2 /*>, y' — -jr- X 

v a: 2 — 1 0 

-e 4=- . 4.5.3. (a) If a: changes by 1%, then 
y a: 3 

$\Delta y = n \times 0.01y$; therefore when $x$ changes by 
$k$%, we have $\Delta y = n \times 0.01yk$. In the given 
case $x$ changes by 10% and $\Delta y = n \times 0.1y$. 

Since $n = 1/2$, $\Delta y = \frac{1}{2} \times 0.1y = 0.05y$. Therefore 
$y(11) \approx y(10) + 0.05 \times 5 = 5.25$; $y(9) \approx 
y(10) - 0.05 \times 5 = 4.75$. Let us now obtain the 
exact solution. We denote the proportionality 
factor by $k$; then $y = k\sqrt{x}$. Since at $x = 10$, 
$y = 5$, we have $5 = k\sqrt{10}$, whence $k \approx 1.58$ 
(the calculations are carried out to two decimal 
places). Therefore $y = 1.58\sqrt{x}$; $y(11) = 
1.58\sqrt{11} \approx 5.24$; $y(9) \approx 4.74$. (b) The 
approximate values are $y(11) \approx 4.50$; $y(9) \approx 
5.50$. The exact values are $y(11) \approx 4.54$; 
$y(9) \approx 5.56$. (c) The approximate values are 
$y(11) \approx 6.00$; $y(9) \approx 4.00$. The exact values 
are $y(11) \approx 6.05$; $y(9) \approx 4.05$. 
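
The comparison made in 4.5.3 between the percentage rule and the exact power law can be tabulated for all three cases at once. The sketch below assumes, consistently with the numbers quoted above, that the three parts correspond to exponents n = 1/2, -1 and 2 with y(10) = 5; the function names are assumptions chosen for the example.

```python
def power_law_estimate(y0, n, percent_change):
    """Estimate of y after x changes by percent_change %, using dy = n * 0.01 * y * k."""
    return y0 * (1 + n * percent_change / 100)


def power_law_exact(y0, x0, x, n):
    """Exact value of y = k * x**n, given that y(x0) = y0."""
    return y0 * (x / x0) ** n


if __name__ == "__main__":
    for n in (0.5, -1.0, 2.0):
        for x, pct in ((11.0, 10.0), (9.0, -10.0)):
            print(n, x,
                  round(power_law_estimate(5.0, n, pct), 2),
                  round(power_law_exact(5.0, 10.0, x, n), 2))
```

The estimate is good to a few hundredths for a 10% change; for larger changes the exact power law should be used.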

4.6.1. y f —x 2 (x 2 — 1) (7a: 2 — 3). 4.6.2. y' = 


3a: 2 Y x2 ~ \~ x 


(2a: — [— 1) x s 

2 Y x2 ~t x 


( 


(8a: +7) x s \ 

2 Y x2 + x ' 


4.b.3. y = 5a: 4 y 


x 2 — i 


• 2xx b (x s — 2a:) 1 / 5 



( x 3 — 2a:) 1 / 5 
x s — 2a: 


X 


(3a: 2 — 2) x 5 \ x 2 — 1 (simplify). 4.6.4. y' = 

r ~ 2 + (* + ~k) y - 


3x 2 


2 Y X s — 2 
1/3 j/g+1 


4.6.5. y' = 2x]/ r yx-{-x -|-x 2 X. 


* (simplify). 4.6.6. y' = 5 (f/x-f- 
2 Y y x+x v 

1 \4 / 1 i | 




_\ 4 (_U_ 

x / ' 3 y^x 2 


1 \ 5 

ft ) • 4 - 6 ' , ' ! ' = 


3 Y x * 
x 2 + l 
(x 2 -i) 2 


x + [V x + 

4.6.8. y' = 


1 — a: 2 


4.6.9. y' 


* (x 2 —x + l) 2 * 

4.6.10. y' = ~ i2 ~~ Y^T~2 


x 2 + 2a: + 5 

(*+i) 2 * 

3 (3x — 1) 
2x 2 Y * 3 +2~ 
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4.6.11. y'= - 

2x + 3 

3 (£+1)4/3 • 


4.6.14. y' 


(x 2 — l) 3 / 2 * 
4.6.13. = - 

6a: y'^+l 


6 j/"x 2 Y x 2 + j/" a: 


4.6.12. y' = 

4a: 3 Y_ x 
iY x*+x y'x 

. 4.6.15. y' = 


— tf! ^T -— • 4.6.16. y'=*/'(2x+3) 2 + 
(1+^ 2 ) 3 

j - 3/ , 4 -.-.— . 4.6.17. i,' = 3a: 2 Yx - 1 + 


3y" 2a: +3 

X 3 — 1 


2 V"" a: — 1 


^55=1. 


2a: 2 


3y (a: 2 — l) 2 


. 4.6.18. y' = 


_ 2x2 4.6.19. y' = ]/|±| 

3 (a; — 1) 3 |/^ 2x — 3 K 1 


X 


(a;+l) 2 ■ 

4.6.21. y' = 
1 


4.6.20. y' = 


Ax 3 — 10a: 2 — 22a: — 11 


3 (a: — 2) 2 >/>+l) 2 
- x 5 -f- 2a: 3 -f- 2x 2 — 1 

(x 3 + 1) 2 Y x<t — i 

x + 1 \ 2/3 -I- 2a: 


4.6.22. ( ,.^ +r) 

4.6.23. j/' = /i 2 ^T 3 /a:+/5 + 

— (1+2 /x) a:]/ a; 2 —! 


(a: + l) 2 ' 


/a 


-1 


•X 


3 A+/5+ ^V — 4.6.24. y'= 

6 V a: / (x+y a:) 2 

i( 

2a: 


_j_r 6/7 (i i—\ x *+ 

Yx) l 2Yx 3 > 


( x + -yt) 


,1/7 


Y x 3 

4.6.25. y' = 


V a: 

1— 6^ — 4a:|/~a 2 
3(x + l)f/x 2 (x+1) * 

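4.7.1. $y' \approx 2.3\log 2 \times 2^x$. 4.7.2. $y' \approx 2.3\log 5 \times 5^{x+1}$. 
4.7.3. $y' \approx -2.3\log 2 \left(\frac{1}{2}\right)^x$. 

4.7.4. $y' \approx 2.3 \times \frac{10^{\sqrt{x}}}{2\sqrt{x}}$. 4.7.5. $y' \approx 2x \times 2.3\log 2 \times 2^{x^2}$. 
4.7.6. $y' \approx \left(1 - \frac{1}{x^2}\right) \times 2.3\log 2 \times 2^{x + 1/x}$. 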

4.8.1. (a) y' = — «-*; (b) y' = 5e»— 3e 3 *; 

<c) y' = 2xe xi \ (d) y' = 1 e*; (e) y' = 

z y x 

3 (a: 2 — 1) e x3-3;x:+1 . 4.8.2. More than by a 

factor of 150 000. 

4.9.1. log 5 15= log15 - ~ 1.6825. 4.9.2. (a) 

& log 5 

Differentiating both sides of the equation 
(the logarithms are natural) we find --- = 


— — J-— , whence (uu) r = u'v-{-uv\ (b) Differ- 
u v 

entiate term-by-term the equation ln(w/y)= 
ln^ — lnz;. 4.9.3. (a) y' = ^ = 

-i-- (2a:)' = -i- , or y = In 2a: = In 2 + In a:, the- 

1 2a: 

refore y'=(ln 2)'+(lnx)' = — ; (c) y 

= TTT ; (e) y ' = isfe rr' 

( f ) »'=-*r=r; or) ^'=6^ipij5 < h) »'= 

In a: + 1; (i) y' = 3a: 2 In (x + 1) + - q— j - ; 

(j) since In y = x In x, differentiating we find 
\ 

— y’ = In a:+l, whence y' = y (In x-\- 1), or 

y ‘ 

y' = x*(lnx+l); (k) y' = -i-(/5) V " xa " 1 X 
/ x In t _ _|_ j£x 1_\ ^ gee ^ jjj nt tQ (j))_ 

\ Y x 2 — 1 * / 

4.10.1. $y' = 2\cos(2x + 3)$. 4.10.2. $y' = -\sin(x - 1)$. 
4.10.3. $y' = -(2x - 1)\sin(x^2 - x + 1)$. 4.10.4. $y' = 2\sin x\cos x$. 
4.10.5. $y' = 3\cos 3x\cos^2 x - 2\cos x\sin x\sin 3x$. 
4.10.6. $y' = (\sin 2x)^x\left(\ln\sin 2x + \frac{2x\cos 2x}{\sin 2x}\right)$ (cf. Exercise 4.9.3j). 
4.10.7. $y' = \tan x + \frac{x}{\cos^2 x}$. 


1 1 

' 2 sin 2 (x/2) ' 
4.11.1. y' = — 

y i 

1 4.11.3 y' 


4.11.2. y' = 

-7—, — 5 . 4 . 11.0 y = 7 = =. . 4.11.4. y'= 

1 + x 2 1— 4x 2 

3 , , . , 2x — 1 

9x 2 +6x+2 * 4 * ' ' V x 4 — 2x 3 +x 2 + l ’ 

arc tany* 


4.11.6. y'= 

2Yx(x+1) 

4.13.1. —1; 1. 4.13.2. — 1. 


Chapter 5 

5.2.1. 1/6. 5.2.2. 1/2. 5.2.3. 2. 

5.2.4. $\pi/4$ ($= \arctan 1 - \arctan 0$). 

5.2.5. e 5.2.6. 1. 

e 
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3_ 

4 

i_ 

2 


5.3.1. z 8 — z 2 + * + C. 5.3.2. ~ x* — 

5 

■7*3 -7*2 A O 

^ 4 +^— 5.3.3. -La*-- 1.*»+ 
x 2 -\~C. 5.3.4. -y a; 2 -j- 2x -J- 3 In \x\-\-C, 


5.3.5. $2x + \ln|x - 1| + C$. 5.3.6. $\frac{a}{c}x + \frac{bc - ad}{c^2}\ln|cx + d| + C$. 
5.3.7. If $\frac{x}{(x - 2)(x - 3)} = \frac{A}{x - 2} + \frac{B}{x - 3}$, then 
$A(x - 3) + B(x - 2) = x$, i.e. $(A + B)x - (3A + 2B) = x$ 
and $A + B = 1$, $3A + 2B = 0$; therefore $A = -2$, $B = 3$. 
Answer. $-2\ln|x - 2| + 3\ln|x - 3| + C$. 
5.3.8. $-\ln|x - 1| + \ln|x - 2| + C$. 5.3.9. If 
$\frac{x + 1}{x^2 - 3x + 2} = \frac{x + 1}{(x - 1)(x - 2)} = \frac{A}{x - 1} + \frac{B}{x - 2}$, 
then $A(x - 2) + B(x - 1) = x + 1$, whence $A = -2$, 
$B = 3$ (in the previous equality first assume 
that $x = 1$ and then $x = 2$). Answer. 
$-2\ln|x - 1| + 3\ln|x - 2| + C$. 5.3.10. If 
$\frac{x + 2}{x(x^2 + 1)} = \frac{A}{x} + \frac{B}{x^2 + 1} + \frac{Cx}{x^2 + 1}$, then 
$A(x^2 + 1) + Bx + Cx^2 = x + 2$, i.e. $A = 2$, $B = 1$, 
$C = -2$. Answer. $2\ln|x| + \arctan x - \ln(x^2 + 1) + C$. 
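
Answers obtained by partial fractions are easy to verify by differentiating them numerically and comparing with the integrand. The sketch below does this for 5.3.7 and 5.3.10; the integrands are inferred from the decompositions given above, and the function names and the sample points are assumptions chosen for the example.

```python
import math


def derivative(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


def answer_537(x):
    """Answer to 5.3.7 without the constant C."""
    return -2 * math.log(abs(x - 2)) + 3 * math.log(abs(x - 3))


def integrand_537(x):
    """Integrand inferred from the decomposition in 5.3.7."""
    return x / ((x - 2) * (x - 3))


def answer_5310(x):
    """Answer to 5.3.10 without the constant C."""
    return 2 * math.log(abs(x)) + math.atan(x) - math.log(x * x + 1)


def integrand_5310(x):
    """Integrand of 5.3.10."""
    return (x + 2) / (x * (x * x + 1))


if __name__ == "__main__":
    print(derivative(answer_537, 5.0), integrand_537(5.0))
    print(derivative(answer_5310, 1.5), integrand_5310(1.5))
```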

5.4.1. $\cos x + x\sin x + C$. 5.4.2. $x(\ln|x| - 1) + C$. 
(Hint. In formula (5.4.3) put $f = \ln x$, $dg = dx$.) 
5.4.3. $\frac{1}{2}x\sin 2x - \left(\frac{1}{2}x^2 - \frac{1}{4}\right)\cos 2x + C$. 
5.4.4. $e^{-x}(-x^3 - 3x^2 - 6x - 6) + C$. 
5.4.5. $(2x + 1)\cos x + (x^2 + x - 1)\sin x + C$. 

5.4.6. Let $\int (2x^2 + 1)\cos 3x\,dx = (a_1x^2 + b_1x + c_1)\cos 3x + (a_2x^2 + b_2x + c_2)\sin 3x$. 
Differentiating both sides of the equation we get 
$(2x^2 + 1)\cos 3x = (2a_1x + b_1)\cos 3x - (3a_1x^2 + 3b_1x + 3c_1)\sin 3x + (2a_2x + b_2)\sin 3x + (3a_2x^2 + 3b_2x + 3c_2)\cos 3x$, 
i.e. $(2x^2 + 1)\cos 3x = (-3a_1x^2 - 3b_1x - 3c_1 + 2a_2x + b_2)\sin 3x + (3a_2x^2 + 3b_2x + 3c_2 + 2a_1x + b_1)\cos 3x$. 
Thus we have $2x^2 + 1 = 3a_2x^2 + 3b_2x + 3c_2 + 2a_1x + b_1$, 
$0 = -3a_1x^2 - 3b_1x - 3c_1 + 2a_2x + b_2$, whence 
$3a_2 = 2$, $3b_2 + 2a_1 = 0$, $3c_2 + b_1 = 1$, $-3a_1 = 0$, 
$-3b_1 + 2a_2 = 0$, $-3c_1 + b_2 = 0$. From this 
system of equations we find: $a_1 = 0$, $b_2 = 0$, 
$c_1 = 0$, $a_2 = \frac{2}{3}$, $b_1 = \frac{4}{9}$, $c_2 = \frac{5}{27}$. Answer. 
$\frac{4}{9}x\cos 3x + \left(\frac{2}{3}x^2 + \frac{5}{27}\right)\sin 3x + C$. 

5.4.7. $x\arcsin x + \sqrt{1 - x^2} + C$. 

5.4.8. $x\arctan x - \frac{1}{2}\ln(x^2 + 1) + C$. 5.4.9. Set 
$f = \sin 3x$, $dg = e^{2x}\,dx$; then integrating by 
parts yields $\int e^{2x}\sin 3x\,dx = \frac{1}{2}e^{2x}\sin 3x - \frac{3}{2}\int e^{2x}\cos 3x\,dx$. 
In the last integral we put $f = \cos 3x$, $dg = e^{2x}\,dx$ 
and again integrate by parts to obtain 
$\int e^{2x}\sin 3x\,dx = \frac{1}{2}e^{2x}\sin 3x - \frac{3}{2}\left(\frac{1}{2}e^{2x}\cos 3x + \frac{3}{2}\int e^{2x}\sin 3x\,dx\right)$. 
Considering the last expression as the equation 
with the unknown $\int e^{2x}\sin 3x\,dx$ we find 

$\int e^{2x}\sin 3x\,dx = \frac{e^{2x}}{13}(2\sin 3x - 3\cos 3x) + C$. 
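
The result of 5.4.9 agrees with the general formula No. 9 of Appendix 2 for k = 2, a = 3, and both it and the answer to 5.4.6 can be confirmed by numerical differentiation. The sketch below is an illustration only; the function names, the step `h` and the sample points are assumptions.

```python
import math


def derivative(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


def answer_549(x):
    """Answer to 5.4.9 without the constant C."""
    return math.exp(2 * x) * (2 * math.sin(3 * x) - 3 * math.cos(3 * x)) / 13


def answer_546(x):
    """Answer to 5.4.6 without the constant C."""
    return (4 / 9) * x * math.cos(3 * x) + ((2 / 3) * x * x + 5 / 27) * math.sin(3 * x)


def residuals(x):
    """Differences between the numerical derivatives and the integrands at x."""
    r549 = derivative(answer_549, x) - math.exp(2 * x) * math.sin(3 * x)
    r546 = derivative(answer_546, x) - (2 * x * x + 1) * math.cos(3 * x)
    return r549, r546


if __name__ == "__main__":
    print(residuals(0.3))
```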


5.4.10. — — ( C0S 2 X _+ 2 3 * a 2^) +c- 
0 

1 / 

5.5.1. $\frac{1}{3}\sin(3x - 5) + C$ (put $3x - 5 = t$, $dx = \frac{1}{3}\,dt$). 

5.5.2. $-\frac{1}{2}\cos(2x + 1) + C$. 


5.5.3. ~Y(3x — 2f + C. 5.5.4. Since at 

$\sqrt{x} = z$ we have $x = z^2$ and $dx = 2z\,dz$, then 

$\int \frac{x\,dx}{x + \sqrt{x}} = \int \frac{z^2 \cdot 2z\,dz}{z^2 + z} = 2\int \frac{z^2\,dz}{1 + z} = 2\int \frac{z^2 - 1 + 1}{1 + z}\,dz = 2\int \left[(z - 1) + \frac{1}{1 + z}\right]dz = z^2 - 2z + 2\ln(1 + z) + C = x - 2\sqrt{x} + 2\ln(1 + \sqrt{x}) + C$. 

5.5.5. l/'i 2 ^5 + C. 5.5.6. 


sin 4 x 


C, 

5.5.7. 


1 1 

or y cos 2 x -| — cos 4 x + C . 


1 


c. 


5.5.8. 


3 sin 3 x sin x 
— In | cos x | + C (put cos x — t). 5.5.9. 

arcsin — + C. 5.5.11. In | a;+ l/^ 2 + a; 2 | +67 . 
a 

(In addition to the method indicated in 
p. 168, we can make here an artificial sub- 
stitution x = a tan t (i.e. t = arctan (a;/a)), 
dx =adt /cos 2 t and a 2 -}-x 2 = a 2 /cos 2 t or the 
substitution Ya 2J r x2 — z — whence, squar- 
ing both sides of this last equation we get 
a 2 = z 2 —2zx (the term with x 2 cancels out in 
both sides of the equation) and thus x = 


2 z ’ 

z 2 + a 2 
2 z 

z 2 +a 2 


Y cl 2 J rx 2 = z — x=z — 
1 z2z 


dx = 


2z 2 


dz , 


2 

whence 


2z 

T- Jg).. ^z ^ 


z 2 


dx 


V a* + 
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5[ 


(z 2 + a 2 ) . z2 + a 2 


2z 2 ‘ 2z 

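5.5.12. Set $x = a\tan t$, then $x^2 + a^2 = \frac{a^2}{\cos^2 t}$, 
$dx = \frac{a\,dt}{\cos^2 t}$, so that 

$\int \frac{dx}{(x^2 + a^2)^2} = \frac{1}{a^3}\int \cos^2 t\,dt = \frac{1}{2a^3}\int (1 + \cos 2t)\,dt = \frac{1}{2a^3}\left(t + \frac{1}{2}\sin 2t\right) + C$. 

Since $t = \arctan\frac{x}{a}$ and 
$\sin 2t = \frac{2\tan t}{1 + \tan^2 t} = \frac{2x/a}{1 + (x/a)^2} = \frac{2ax}{x^2 + a^2}$, 
we finally have 

$\int \frac{dx}{(x^2 + a^2)^2} = \frac{1}{2a^3}\arctan\frac{x}{a} + \frac{1}{2a^2}\cdot\frac{x}{x^2 + a^2} + C$. 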


Chapter 6 

6.1.1. $y = (ax_0^3 + bx_0^2 + cx_0 + d) + (3ax_0^2 + 2bx_0 + c)(x - x_0) + (3ax_0 + b)(x - x_0)^2 + a(x - x_0)^3$. 
All the subsequent terms equal zero; the sum of 
the above four terms equals the polynomial. 6.1.2. Since 
$y(0) = 0$, $y'(0) = 1$, $y''(0) = 2$, ..., $y^{(n)}(0) = n$, ..., 
we have $y = x + x^2 + \frac{x^3}{2!} + \dots = x\left(1 + x + \frac{x^2}{2!} + \dots\right)$. 
6.1.3. $y = e\left[1 + (x - 1) + \frac{1}{2!}(x - 1)^2 + \frac{1}{3!}(x - 1)^3 + \dots\right]$. 

6.1.4. It is easy to verify that $(1 + r)^m \approx 1 + mr + \frac{m(m - 1)}{2}r^2$ 
(cf. Section 6.4), $e^{mr} \approx 1 + mr + \frac{m^2r^2}{2}$. 
Thus, for small $r$ the quantity $e^{mr}$ differs from 
$(1 + r)^m$ by the term $(1/2)mr^2$. For example, if 
$m = 50$, $r = 0.02$ (see the example on p. 143), then 
$(1/2)mr^2 = 0.01$, i.e. the error is less than 0.5%. (When 
$mr^2$ is small, $m$ can be large; here $mr^3$, $mr^4$, ... 
will of course be small.) 6.1.5. (a) By formula 
(6.1.24), with $y(0) = 1$ and $\Delta y = y(\Delta x) - 1$, 
we have 

Δx          1        1/2      1/4      1/8      → 0 
y(Δx)       2.7183   1.6487   1.2840   1.1331   1 
Δy/Δx       1.718    1.297    1.136    1.065    1 

By formula (6.1.25), setting $\Delta y = y\left(\frac{\Delta x}{2}\right) - y\left(-\frac{\Delta x}{2}\right)$, we get 

Δx          1        1/2      1/4      1/8      → 0 
y(Δx/2)     1.6487   1.2840   1.1331   1.0645   1 
y(-Δx/2)    0.6065   0.7788   0.8825   0.9394   1 
Δy          1.0422   0.5052   0.2506   0.1251   Δx 
Δy/Δx       1.042    1.010    1.002    1.001    1 
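
Both tables are easy to regenerate, which also shows how much faster the symmetric quotient approaches the true derivative y'(0) = 1 of y = e^x. In the sketch below the function names are assumptions chosen for the example; the two quotients are the one-sided and the symmetric difference ratios described in the text.

```python
import math


def forward_quotient(f, x, dx):
    """One-sided ratio (f(x + dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx


def symmetric_quotient(f, x, dx):
    """Symmetric ratio (f(x + dx/2) - f(x - dx/2)) / dx."""
    return (f(x + dx / 2) - f(x - dx / 2)) / dx


if __name__ == "__main__":
    for dx in (1.0, 0.5, 0.25, 0.125):
        print(dx,
              round(forward_quotient(math.exp, 0.0, dx), 3),
              round(symmetric_quotient(math.exp, 0.0, dx), 3))
```

The error of the one-sided ratio falls roughly in proportion to Δx, while that of the symmetric ratio falls in proportion to (Δx)².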


6.1.6. Substitute into the numerator on 
the right-hand side of (6.1.27) expression 
(6.1.13a) for $f(a + \Delta x)$ and $f(a - \Delta x)$ 
(in (6.1.13a) set $x = a + \Delta x$ and $x = a - \Delta x$ 
respectively). 

6.2.1. Apply Taylor’s formula (6.1.18) to 
the function (6.1.1) (or Maclaurin’s formula 
(6.1.19) to the function (6.1.1a)). 6.2.2. 
$\operatorname{si} x = C + x - \frac{x^3}{3 \cdot 3!} + \frac{x^5}{5 \cdot 5!} - \dots$, where $C = 0$ if 
we assume that $\operatorname{si} 0 = 0$. 

6.3.1. (a) $y = \frac{1 + x}{1 - x} = 1 + 2x + 2x^2 + 2x^3 + 2x^4 + \dots$; 
(b) $y = \ln(1 + x) = x - \frac{x^2}{2} + \dots$. 
6.3.2. $y = \ln x = (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \dots$. 
In Exercise 6.3.1 the series can be used for computations for 
$|x| < 1$; in Exercise 6.3.2, for $0 < x < 2$. 

6.3.3. $f(x)g(x) = f(0)g(0) + [f'(0)g(0) + g'(0)f(0)]x + \frac{1}{2}[f''(0)g(0) + 2f'(0)g'(0) + g''(0)f(0)]x^2 + \dots$. 

6.4.2. These are the formulas (6.4.3) for 
1 

a—i and m ■ — in which we retained two 

n 

first (a) and three first (b) terms respectively. 

6.4.3. yT. 2 ~ 1.067 and ~ 1.062 (the table 

gives 1.063); f / TI~ 1.033 and ~ 1.032 (the 
table gives 1.032). 6.4.5. For x = 0 even 

(Y x)' does not exist (becomes infinite). 

6.5.1. (a) 1; (b) -1/2; (c) 1/3; (d) oo; 

(e) 1; (f) 1/2. 6.5.4. (a) Substitute t = —■ 


and consider the ratio / (A) g (-£-) ; 

use the rule (6.5.2). (b) Prove that 

- » where c is a number 

g (x)—g(x 0 ) g (c) 

intermediate between x and x 0 (why?); fur- 
ther assuming that x and x 0 are close to a 
(and are on the same side of a), first let 

x a with x 0 constant ( here — /i£o) 

0 V g(x)—g(x 0) 
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JS-) where by condition x-+a\ and then 

g(x) / 

let x Q -+-a. 

6.6.1. (a) &= — c ; (b) y*—kx*=a; 
(0 <•> 
tan ~Y = Ae sinx . 6.6.3. y = 0. 6.6.4. Expand 

,e x2/2 0 (x) in a series in powers of x (to this 
-end we must expand e~ t 2/2 in series in pow- 
ers of t and then integrate the obtained equa- 
tion termwise) ; prove that for an appropriate x 0 
the product e X2 / 2 (D (x) can be represented by 
the series (6.6.23) (where a 0 = 0). 

6.7.1. z = 2 st — (2st— 2 0 )exp [— to)] ; 

the water level grows, tending asymptotically 
to the value z = z s t. 6.7.2. Find the steady- 
state value z = z s t = constant (=q 0 /k) which 
satisfies the differential equation; then de- 
note z = z s t + % and form the differential 
•equation for the function z x — z x (t). 


Chapter 7 


7.1.1. x= *+ b -Vf + »-ab 7 i 2 Let 

6 

x be the side of the rectangle which belongs 
to the base (equal to a) of the triangle, then 

the area of the rectangle is S (x) — xh ^ 1 — 

x \ h 

— J = hx — x 2 . Solving the equation 

S' (x) = 0 we find £ = y ; then the adjacent 

h 

side of the rectangle equals y and the area 


S(x) = 


ah 


7.1.3. S = 2B 2 (the desired rec- 


tangle is a square). 7.1.4. The radius of the 

3 f— 

= y y- , the altitude h = 2 r. 


base 


7.1.5. /= 


av x -\-bv 2 
v\ + v\ 


7.1.7. The time of motion is 
$T = \frac{1}{v_1}\sqrt{a^2 + x^2} + \frac{1}{v_2}\sqrt{b^2 + (c - x)^2}$, 
where $c$ is the magnitude of the projection 
of $AB$ on the line $l$ of interface, and $a$ and $b$ are 
the distances of points $A$ and $B$ from $l$. The 
condition $\frac{dT}{dx} = 0$ yields 

$\frac{x}{v_1\sqrt{a^2 + x^2}} = \frac{c - x}{v_2\sqrt{b^2 + (c - x)^2}}$, 

or, since $\frac{x}{\sqrt{a^2 + x^2}} = \sin\alpha$ and 
$\frac{c - x}{\sqrt{b^2 + (c - x)^2}} = \sin\beta$, where $\alpha$ and $\beta$ are 
the angles formed by the lines of the body’s 
motion in the I and II media and the perpendicular 
to the line of the interface, we have 

$\frac{\sin\alpha}{\sin\beta} = \frac{v_1}{v_2}$. 

(This is the Snellius law, that is, the point 
must move in the same way as the light ray 
does intersecting the boundary between two 
media with different velocities in the media.) 
To prove that we have obtained the minimum 
of $T$ it is sufficient to find $\frac{d^2T}{dx^2}$; it is easy to 
ascertain that $\frac{d^2T}{dx^2} > 0$ for all $x$. 
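
The conclusion of 7.1.7 can be checked numerically by minimizing the travel time directly and then evaluating the ratio of the sines at the optimum. The sketch below uses a simple ternary search, which is legitimate here because T is convex (its second derivative is positive); the function names and the sample values of a, b, c, v1, v2 are assumptions chosen for the example.

```python
import math


def travel_time(x, a, b, c, v1, v2):
    """T(x) = sqrt(a^2 + x^2)/v1 + sqrt(b^2 + (c - x)^2)/v2."""
    return math.hypot(a, x) / v1 + math.hypot(b, c - x) / v2


def fastest_crossing(a, b, c, v1, v2, iterations=200):
    """Ternary search for the minimum of the convex function T on [0, c]."""
    lo, hi = 0.0, c
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, c, v1, v2) < travel_time(m2, a, b, c, v1, v2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def sine_ratio(x, a, b, c):
    """sin(alpha) / sin(beta) for the crossing point x."""
    sin_alpha = x / math.hypot(a, x)
    sin_beta = (c - x) / math.hypot(b, c - x)
    return sin_alpha / sin_beta


if __name__ == "__main__":
    a, b, c, v1, v2 = 1.0, 1.0, 2.0, 1.0, 0.5
    x = fastest_crossing(a, b, c, v1, v2)
    print(x, sine_ratio(x, a, b, c), v1 / v2)
```

At the computed optimum the ratio of the sines should agree with v1/v2 up to the accuracy of the search.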

7.2.1. $y_{min} = 3$. 7.2.2. At the moment 
of time $t$ the steamer $S$ is represented by the 
line segment $l$ whose endpoints are projected 
on the line which corresponds to the bank 
where the fisherman $F$ sits (we take it for the $x$ 
axis) at the points with the abscissas $x_1 = v(t - t_0)$, 
$x_2 = v(t - t_0) - l$. The distance 
from $F$ to the point $M$ of the steamer $S$ is 
$\Delta(x) = \sqrt{x^2 + h^2}$, where $x_2 \le x \le x_1$; we 
must find $D = \min \Delta(x)$. For $t_0 \le t \le t_0 + l/v$ 
this minimum equals $h$; it is attained at the 
point $x = 0$, which corresponds to the condition 
$\frac{d\Delta}{dx} = 0$; for $t < t_0$ and $t > t_0 + l/v$ there 
are boundary minima $\sqrt{v^2(t - t_0)^2 + h^2}$ and 
$\sqrt{[v(t - t_0) - l]^2 + h^2}$ respectively. (This 
problem can be solved geometrically without 
resorting to differential calculus.) 7.2.3. (a) 
$y_{max} = 0$ at $x = 0$; (b) $y_{max} = 1$ at $x = 0$. 

7.4.1. Prove that y" < 0. 


7.4.2. (a) ( 


T n i r n 
X\ x 2 

2 


1 [n 

> 


x 1 -j-x 2 

2 


at n> 1, 


■ Y ,m <— iifi. 


for 


, ... / + 

x x x 2 , (b) ^ — ^~2 — 

0 < m < 1, x 1 =£ x 2 ; (c) 1 -r- fe - + 

A-)]‘ /k > (^+fL) at k> 0 and x^x 2 

1 1 

(and x u ^ 2 >0); (d) y %i log x x -j- — x 2 log x 2 > 
1 

y (^i + ^ 2 ) log [(^ 1 + ^ 2 )/2] (in all these cases 

we only give an inequality, which is a par- 
ticular case of the inequality (7.3.1)). 

7.5.1. n/2. 7.5.2. 1/6. 7.5.3. 2ji+ 4/3 and 

37 

6ji — 4/3. 7.5.4. (a) a In 2; (b) -y a (here a 

is the amount of paint needed for a unit 
surface area). 7.5.5. 10 ji. 

7.7.1. F(a) + F(b)= j -f- + J-f- = 
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ab 


1 a 1 

2 

- _ i r 4 

7.8.1. y = x 2 — ~ 2 ~ \ x 2 dx = -^~. 


y(i) = i <y ~ 1.33 < 


7.8.2. (a )y =-^y( 0 )+~y(i)+ j y(2) = 

b 

y X 1+-^- X 4 = -|- ; (b) j = + ra 3 + 

a 

1 \ |6 1 ,1 , 

~ 2 px 2 + qxj\ -jr(6 3 -fl 3 )ryp(& 2 -fl 2 ) + 


1 1 

- 5 — 1 — — . 7.8.8. If JT is the period of the 

& Jl v 

function y, then we have sin[(o(£+r) + 

Ojr 

+a]=sin (cof+a), whence (dT=2n, T— — . The 

(0 

pjgj*0 m 

period of the function y 2 is — = ; con- 

2 a) 

y ( 0 ) + y ( 2 ) _ 2 sequently, we must find the mean value of 

the function y = sin 2 (cot + a) in the interval 

n We get 


= from t = 0 to t = - 


1 




F(R) + F(2R) 


= 0.625 F 0 > 0.5F 0 


4 °* 2 

(the error is of the order of 10 %); 

(b) F ( -- t, 2 * ) (4 R )=^ F °' there - 

1 


, 1 „ , 2 4 „ 

fore T F 0 ^ T -F 0 


* -109 p ^ 

6 X 4 F ° 216 F ° ~ 


0.505F 0 (the error is 1%). 7.8.5 
m — n 


72 + 1 * 

7.8.6. — — — — y . For t ?2 = 72 + v with v<72. 

In 772 — In 72 ^ ’ 

we have In m = In (72 + v)=ln 72 + ln^ 1+— j = 


In 72 + 


.... 7.8.7. (a) Both mean 


72 272 2 

1 1 1 

values are equal to y ; (b) — and 


Jt/O) 

1 


sin 2 (of + q) 
Jl/O 


9(6 -a) = (6 - a) [ + ( 6 * + ab + a*) + 

■l-p( 6 + a) + . According to Simpson’s 

b 

rule (7.8.3) we have ^ y dx =(b — a) y= 

(6-«z) [+(«)+ + (i) + + (6)]. 

Substituting the values y (a), y ^ — y~-) » 


3 i J [2 2 cos2 ( w * + a)J df — y . 

0 

1 

7.9.1. (a) s= ^ V^l + 4. x 2 dx; (b) s = 


b 2 x 2 


\dx . 


and y (6) and comparing the result with the 
above expression we see that they are identical. 
2 R 

„ n 0 - 1 f A 1 / A \ |2R 

7.8.3. F ~ 2R — R J r 2 dr ~ R ( r)L 

R 

R 

value of the force of gravity at this section 
is half as great as that at the earth’s sur- 
face, F~= 0.5/%. 7.8.4. (a)F (R) = F 0 J F(2R) = 


('S' + 1) ‘""‘Tir- The 4 


0 0 

7.9.2. On changing the variable we get 
Vl+ei 

s== 1 W~[dz. But j (l +++* = 

K 2 

H- J +r =z + I (-^T-^r) = 

= ( z + 
/l+^-l/"2 + 
142 — 1 


| 2 1 

z-jr-Tr In ■ 7 r 4 1 ( + £)• Therefore, 5 


In 


2+1 

2—1 \ \VT+72 

z + l ) 


J_ 1 + C 2 

2 11 1/1++2+1 


1/1 

1 


-yin 


.7.9.3. 


/ 2 + 1 

Partitioning the arc of the catenary curve 
from x =0 to £=0.9 and from £—0.9 to £—2 we 
e 2 + e ~ 2 


find s x 

2 

_1_ 

2 


1.043, s 2 - 
2 


e 0.9 + e -0.9 


-x dx . In the last integral we 
__ _ 2 e x dx 

J (e 

1 


0.9 

can set e x = t, then f — - — — ( 

’ J **—<?-* J (**)2_l 

f 2dt f / 1 l \ 

J-^ZT- j 7+rj^* Finally we 

obtain s 2 ~ 2.624 and 5 = Sl +s 2 ~ 3.667. From 
the exact formula we get s— 3.627. The error 
is approximately l<y Q . 7 . 9 . 4 . 5 ~ 1.146. 

7.9.5. The length of the circle’s arc being 
considered 


3 5 — 0946 
35




r 18 432 

Accordingly, we obtain the following esti- 
mates of the number Jt: (1) Jt ^ ,/• = -(i+ 


12 


.±) 

160/ 


: 3.117; (2) jt~- 


iIo+Jb) - 3 - 133; < 3 > 31 - 


V* 

4 

V2 


V2 

t 1 


1 




( 1+ ^ 


12 

1_ 

12 “ 


35 


160 ' 896 

7.10.1. 


(b) * = - 


r 18 432 
(a) k =- 
2*3 


3.138. 


ab 


)/' ( a 2 sin 2 t-\-b 2 cos 2 t) z 


(c) = 


V ( i +* 4 ) 3 

12a* 2 

y'(l-f 16a 2 *^ 


7.11.2. F=2it. 7.11.3. x c = -R. 

JX 

7.12.1. (a) $y_{max} = 2$ at x = 0; $y_{min} = -2$
at x = 2; (b) there are no maxima or minima;
the curve intersects the x axis at the point
$x = 1 + \sqrt[3]{14} \approx 3.4$ and the y axis at the point
y = −15; (c) there are no maxima or minima;
the curve intersects the x axis between the
points x = 0 and x = −1 and the y axis at
the point y = 3. 7.12.2. (a) There are three
roots; (b) three roots; (c) two roots; (d) one
root.


Part 2 
Chapter 8 

8.1.1. T ≈ 1600 years. 8.1.2. 176.5 g.
8.1.3. 53.3 g. 8.1.4. We know that
$N(t) = N_0 e^{-t/\bar t}$, where $N_0$ is the quantity of substance
at the initial time t = 0. We are interested in
the time $t_1$ by which there will be
(100 − 1)% = 99% untransformed substance:
$N(t_1) = \frac{99}{100} N_0$. Therefore $\frac{99}{100} N_0 = N_0 e^{-t_1/\bar t}$, whence
$t_1 = \bar t \ln\frac{100}{99}$. For radium $\bar t = 2400$ years and
therefore $t_1 = 2400 \ln\frac{100}{99} \approx 24$ (years). Similarly,
for the three remaining cases we find
$t_2 \approx 250$ years; $t_3 \approx 5500$ years; $t_4 \approx 11\,000$
years. 8.1.5. Suppose that at the initial time
t = 0 the number of radium atoms in $10^{12}$
atoms of rock is $N_0$; at the time t = 10 000
it equals 1. Therefore we have $1 = N_0 e^{-10\,000/2400}$,
whence $N_0 \approx e^{4.17} \approx 65$. Similarly, we find
that $10^6$ years ago $N_0 \approx e^{417} \approx 10^{181}$. Clearly,
the result is preposterous: $10^{12}$ atoms of rock
contain $10^{181}$ radium atoms! No less absurd
is the result of calculations of the amount
of radium $5 \times 10^9$ years ago. All this proves
that the initial assumption is incorrect: the
present amount of radium cannot be regarded
as the result of disintegration of the original
supply of radium (when the earth originated)
(cf. Section 8.4).
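
The estimates in 8.1.4 and 8.1.5 can be checked numerically. The sketch below assumes the exponential law $N(t) = N_0 e^{-t/\bar t}$ with the mean life of 2400 years quoted above; the function names are illustrative, and the fractions 0.90, 0.10 and 0.01 for the three remaining cases are inferred from the quoted answers, not taken from the problem statement.

```python
import math

MEAN_LIFE_YEARS = 2400.0  # mean life of radium used in the solution above


def time_until_fraction_left(fraction_left, mean_life=MEAN_LIFE_YEARS):
    """Solve fraction_left = exp(-t / mean_life) for t (in years)."""
    return mean_life * math.log(1.0 / fraction_left)


def initial_atoms(years_ago, atoms_now=1.0, mean_life=MEAN_LIFE_YEARS):
    """Atoms needed years_ago so that atoms_now survive today."""
    return atoms_now * math.exp(years_ago / mean_life)


def log10_initial_atoms(years_ago, mean_life=MEAN_LIFE_YEARS):
    """Decimal exponent of the initial number of atoms (avoids overflow)."""
    return (years_ago / mean_life) / math.log(10.0)


if __name__ == "__main__":
    # Assumed fractions of untransformed substance for the four cases.
    for fraction in (0.99, 0.90, 0.10, 0.01):
        years = time_until_fraction_left(fraction)
        print(f"{fraction:.2f} left after about {years:.0f} years")
    print("atoms 10 000 years ago:", round(initial_atoms(10_000)))
    print("decimal exponent 10^6 years ago:", round(log10_initial_atoms(1e6)))
```

Working with the decimal exponent for the million-year case sidesteps the enormous number itself, which is exactly the quantity the solution calls absurd.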


Chapter 9 


9.1.1. $A(t) = -h \int_{t_0}^{t} v^2\,dt$; A is negative
since $\int_{t_0}^{t} v^2\,dt > 0$ for $t > t_0$. 9.1.2. The motion
of the body is periodic with the period
$T = 2\pi/\omega$; we have to find the work for a half
period. During the first quarter of the period
the velocity is positive, and therefore F = −h;
during the second quarter of the period the
velocity is negative and F = h. During each of these
time intervals the force is constant, and the
work equals the product of the force by
the distance covered by the body; for the
first and second quarters of the period we
have $A_1 = -hB$ and $A_2 = h(-B) = -hB$; the
work during the half period is $A = A_1 + A_2 = -2hB$.
9.1.3. $A = B f_0 \omega_1 \int_0^T \sin\omega_0 t \cos\omega_1 t\,dt$.
This integral can be easily evaluated if we
recall that $\sin\omega_0 t \cos\omega_1 t = \frac{1}{2}[\sin(\omega_0 + \omega_1)t + \sin(\omega_0 - \omega_1)t]$.
Therefore
$A = \frac{B f_0 \omega_1}{2} \int_0^T [\sin(\omega_0 + \omega_1)t + \sin(\omega_0 - \omega_1)t]\,dt = \frac{B f_0 \omega_1}{2}\left[\frac{1}{\omega_0 + \omega_1} + \frac{1}{\omega_0 - \omega_1} - \frac{\cos(\omega_0 + \omega_1)T}{\omega_0 + \omega_1} - \frac{\cos(\omega_0 - \omega_1)T}{\omega_0 - \omega_1}\right]$.
In the case where $\omega_1 = \omega_0$
we cannot use the last formula. In this
case, however, $\sin\omega_0 t \cos\omega_0 t = \frac{1}{2}\sin 2\omega_0 t$,
whence $A = \frac{B f_0 \omega_0}{2} \int_0^T \sin 2\omega_0 t\,dt = \frac{B f_0}{4}(1 - \cos 2\omega_0 T)$.
9.1.4. The work of the air resistance is Ay(t) = aS( ^ g t 4 . The work of

the force of gravity is A 2 (t)= — j- t 2 . For 

a ball: Ay (1)= -0.00965, Ay (10)= —96.5, 
Ay (100) = - 965 X 10 s , A 2 (1) = 0.177, A t (10) = 
1.77, A 2 (100) = 177. For a bullet: 4 t (l) = 
-1.18X10-®, Ay (10) = —11.8, Ay (100) = 
— 118 X 10 3 ; A 2 {i) = 0.435, 4* (10) = 43.5, 
A 2 (100) = 4350 (the dimensions of A are in 

joules — J). 9.1.5. A= ^p-v^ b _ We 

find the power W using the formula W = Fv, 

W = ^ - v - . Let us find the velocity 

v (for a given $v_0$) at which the power is the
greatest, i.e. when $dW/dv = 0$; we get $v_1 = v_0$
and $v_2 = v_0/3$. Clearly, we are not interested
in the value $v = v_0$, at which the power is
zero. Of interest here is the value $v = v_0/3$
(the reader can make a complete analysis
based on taking into account the sign of
$d^2W/dv^2$). For $v_0 = 30$ m/s and $v = 10$ m/s we
have $W_{max} \approx 25.75 \times 10^5$ W. 9.1.6. The work
of the force for one period is $A = B\pi f \sin\alpha$;
the power is $W = \frac{B\omega f}{2} \sin\alpha$.


9.3.1. We take as the x axis the straight
line on which the charges are located; let
the charge $e_1$ coincide with the origin and
the charge $e_2$ be at the point x = 2a. The equilibrium
of the charge e at x is only possible
if $F = -\frac{du}{dx} = 0$. But
$F = \frac{e_1 e}{x^2} - \frac{4 e_1 e}{(2a - x)^2}$ if 0 < x < 2a, i.e. if the charge e lies
between $e_1$ and $e_2$;
$F = -\frac{e_1 e}{x^2} - \frac{4 e_1 e}{(2a - x)^2}$ if x < 0, and
$F = \frac{e_1 e}{x^2} + \frac{4 e_1 e}{(2a - x)^2}$ if x > 2a. In the first
case the condition F = 0 yields $x_1 = 2a/3$,
$x_2 = -2a$ (the second root $x_2$ is neglected
since we must have x > 0); in the cases
where x < 0 and x > 2a the equation F = 0
has no solution at all. Consequently, there
is one position of equilibrium $x_1 = 2a/3$. Then
we calculate $\frac{d^2u}{dx^2}$ at the point $x = x_1 = 2a/3$;
we find that if e > 0, then $\frac{d^2u}{dx^2} > 0$, i.e. the
equilibrium is stable, but if e < 0, the equilibrium
is unstable. 9.3.2. There is one position
of equilibrium $x_1$ outside the charges.
If the coordinate system is chosen in the
same way as that in the solution to Exercise
9.3.1, then $x_1 = -2a$. If $e e_1 > 0$, the equilibrium
is stable, but if $e e_1 < 0$, the equilibrium
is unstable.


9.4.1. The equation of motion is $m\frac{dv}{dt} = F$;
using the fact that v = 0 at t = 0, we find
$v = \frac{F}{m} t$. Therefore $\frac{dx}{dt} = \frac{F}{m} t$; $dx = \frac{F}{m} t\,dt$,
whence $\int_0^x dx = \frac{F}{m} \int_0^t t\,dt$, since x = 0 at t = 0.
Finally $x = \frac{F}{2m} t^2$. 9.4.2. (a) $x = v_0 t + \frac{F}{2m} t^2$;
(b) $x = x_0 + v_0 t + \frac{F}{2m} t^2$.

9.4.3. 2.5 m.

9.4.4. The equation of motion is $m\frac{dv}{dt} = mg$,
whence we find $x = \frac{gt^2}{2}$, $t = \sqrt{\frac{2x}{g}} \approx 4.5$ s for x = 100 m.
9.4.5. (a) t = 3.6 s;
(b) t = 5.6 s. Further, for the case (a) $v = gt + v_0$.
Let $v_1$ be the velocity at the moment
of impact; then $v_1 = g t_1 + v_0$, with $t_1$ the
landing time. From the equation $v = gt + v_0$
we find $x = v_0 t + \frac{gt^2}{2}$. Let the ball
fall from the height H; then x = H at $t = t_1$,
therefore $2H = 2 v_0 t_1 + g t_1^2$, whence
$t_1 = \frac{-v_0 + \sqrt{v_0^2 + 2gH}}{g}$
and $v_1 = g t_1 + v_0 = \sqrt{v_0^2 + 2gH}$.
In the case (b) $v = gt - v_0$ and
the rest is the same; the final velocity
obtained is the same as in (a).
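
The landing-time formula of 9.4.5 is easy to check numerically. The sketch below assumes g = 9.8 m/s² and measures x downward, as in the solution; the function names, the sample height and the sample initial speed are illustrative only, since the data of the exercise are not quoted here.

```python
import math

G = 9.8  # m/s^2, assumed value of the acceleration of gravity


def landing_time(height, v0=0.0, g=G):
    """Positive root of height = v0*t + g*t**2/2 (x measured downward)."""
    return (-v0 + math.sqrt(v0**2 + 2.0 * g * height)) / g


def impact_speed(height, v0=0.0, g=G):
    """Speed at landing; depends on v0 only through v0**2."""
    return math.sqrt(v0**2 + 2.0 * g * height)


if __name__ == "__main__":
    # Exercise 9.4.4: free fall from rest through 100 m.
    print(f"t = {landing_time(100.0):.2f} s")
    # Thrown down (v0 > 0) or up (v0 < 0): the impact speed is the same.
    for v0 in (5.0, -5.0):  # illustrative initial speeds
        t1 = landing_time(20.0, v0)
        print(f"v0 = {v0:+.0f} m/s: t1 = {t1:.2f} s, "
              f"v1 = {G * t1 + v0:.2f} m/s")
```

Because only $v_0^2$ enters the impact speed, throwing the ball up or down with the same speed gives the same final velocity, which is the closing remark of the solution.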
k 

9.4.6. + v 0 t. 9.4.7. (a) x = 

2n 


f 


COS G)£, 

/ 

771(0 


T = 


(0 

f 


(b) x— — - — £ 

777(0 


/ 


/ 

772 (O 2 ’ 

- sin w£. 


9.4.8. Let the desired velocity be u 0 . Then 


the law of body’s motion is x = 


-v 0 {t- 


£o)+ tj — (t — £ 0 ) 2 . Since x — x ± at £ = t 1 , we 


2m 

have x x = 
whence v n — - 


v o (ti ^o) + 

_ ^ 7 ? 

(*i“ 


2m 


*o)* 


£i £ 0 

F2 

9.8.1. K = —-t 2 = F(x-x 0 ). 9.8.2. = 


/ 2 


2m(o 2 

mA 2 (n 2 


2m 

sin 2 at; K max = 


f 2 


2t77(0 2 

777^4 2 (0 2 


9.8.3. K = 

2 sin 2 (g)£ + a) = " VA . 9.8.5. (a) A ~ 

4 X 10 7 J, W ~ 2.2 x 10 5 W; (b) A ~ 7 X 10 7 J, 
W ~ 38 X 10 4 W. 9.8.6. We find the work 
of each separate force. First we determine 
the velocity of the body. From the equation 

777 = at-\- a (0 — £) = a0 we find v = v 0 -}- 

aQ 

— t. The work of force F i is 4i = 

777 1 

e 


j at ( v o + ~ t) dt = 


av 0 d 2 . a 2 0 4 

2 h ^T 


. Similarly, the work of force $F_2$ is $A_2 =$
a 2 0 4 


av 0 Q 2 


6m 
impulse 

e 

\ at dt — - 


We now obtain the product of the 


by 

a 0 2 


the 


average 

e 

2= j « (0 ' 


velocity / i = 
a 0 2 

- t) dt — - — ^ — , 


j ( v ° + J £r t ) dt 

v=— 0 = v ° + 2 m Q2 ' and 

a0 2 / , a \ - 

therefore /ii> — “ ^ ( y o~r ’ ^ 2 ^ = 

( y o+"^T 02 ) • We can see that ( 7 i + 

j 2 )^=A 1 + A 2 though hv=£A u I 2 v =£ A 2 
(cf. p. 326). 9.8.7. At the beginning of the
experiment the mass m had a velocity $v_0$
(it was moving together with the train) and
a kinetic energy $K_1 = \frac{m v_0^2}{2}$. After the action
of the man the velocity of the mass
became $v_0 + v_1$, $K_2 = \frac{m (v_0 + v_1)^2}{2}$, with $v_1 = \frac{Ft}{m}$.
The change in the kinetic energy is
$\Delta K = \frac{m (v_0 + v_1)^2}{2} - \frac{m v_0^2}{2} = m v_0 v_1 + \frac{m v_1^2}{2}$. This is
the work performed on the mass by the train
and the man together. In order to find the
work performed on the mass by the man,
note that the velocity of the mass with
respect to the man moving in the same
train was zero before the experiment and
became $v_1$ after the experiment. Therefore
the work performed by the man is
$A_1 = \frac{m v_1^2}{2} = \frac{F^2 t^2}{2m}$. It is easy now to find the
work $A_2$ of the locomotive: $A_2 = \Delta K - A_1 = m v_0 v_1 = v_0 F t$.
The last result can also be
obtained in a different way. Indeed, the locomotive's
work is $A_2 = \int_0^t v F\,dt$; since the velocity
$v = v_0$ is constant, we have
$A_2 = v_0 \int_0^t F\,dt = v_0 F t$. 9.8.8. Before the experiment
the velocity of mass m and that of the
man are zero. After the experiment the mass
m acquired the velocity $v_1$ and the man $v_2$.
We find these velocities from the equations
$m\frac{dv_1}{dt} = F$ and $M\frac{dv_2}{dt} = -F$, since if the
man acts on the mass m with the force F,
the mass m acts on the man with the force
−F (Newton's third law of motion). We find
$v_1 = \frac{Ft}{m}$, $v_2 = -\frac{Ft}{M}$. The work performed
by the force F on the mass
m and the man is $A = K_1 + K_2$, where
$K_1 = \frac{m v_1^2}{2} = \frac{F^2 t^2}{2m}$ is the change in the kinetic
energy of the mass, and $K_2 = \frac{M v_2^2}{2} = \frac{F^2 t^2}{2M}$
is the change in the kinetic energy of the
man. Whence $A = \frac{F^2 t^2 (M + m)}{2Mm}$. 9.8.9. The
change in the kinetic energy of mass m is
$\Delta K_m = m v_0 v_1 + \frac{m v_1^2}{2}$, where $v_1 = \frac{F}{m} t$. The
change in the kinetic energy of the man is
$\Delta K_M = \frac{M v_2^2}{2} + M v_0 v_2$, where $v_2 = -\frac{F}{M} t$;
$A = \frac{F^2 t^2 (M + m)}{2Mm}$. 9.8.10. (a) If $t_1' = t_1 + b$
and $t_2' = t_2 + b$, then $T' = t_2' - t_1' = t_2 - t_1 = T$.
(b) If $x_1' = x_1 + v t_1 + a$ and $x_2' = x_2 + v t_2 + a$,
with $t_1 = t_2$, then $d' = x_2' - x_1' = x_2 - x_1 = d$.

9.9.1. For φ = 30° we have $x_2 = 565$ m,
$y_{max} = 81.5$ m; for φ = 45° we have $x_2 = 650$ m,
$y_{max} = 163$ m; for φ = 60° we have
$x_2 = 565$ m, $y_{max} = 244$ m. 9.9.2. The equation
of the trajectory is $y = x \tan\varphi - \frac{g x^2}{2 v_0^2 \cos^2\varphi}$.
At a given x = 500 m we seek φ for which
y is maximal, that is, we solve the equation
$\frac{dy}{d\varphi} = 0$. It yields $\tan\varphi = \frac{v_0^2}{g x}$.
Using this fact and the equation
of the trajectory we find $y_{max} = \frac{v_0^4 - g^2 x^2}{2 g v_0^2}$.
Setting $v_0 = 80$ m/s, x = 500 m we determine
$y_{max} \approx 135$ m.
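
The numbers quoted in 9.9.1 and 9.9.2 can be reproduced with a few lines of code. The sketch below assumes $v_0 = 80$ m/s (the value used in 9.9.2) and g = 9.8 m/s²; the function names are illustrative. It uses the standard formulas for the range and the greatest height of a projectile and the trajectory equation $y = x \tan\varphi - g x^2 / (2 v_0^2 \cos^2\varphi)$.

```python
import math

G = 9.8    # m/s^2, assumed
V0 = 80.0  # m/s, initial speed used in Exercise 9.9.2


def horizontal_range(phi, v0=V0, g=G):
    """Distance at which the projectile returns to the launch level."""
    return v0**2 * math.sin(2.0 * phi) / g


def max_height(phi, v0=V0, g=G):
    """Greatest height reached along the trajectory."""
    return (v0 * math.sin(phi))**2 / (2.0 * g)


def height_at(x, phi, v0=V0, g=G):
    """Trajectory equation y(x) for launch angle phi."""
    return x * math.tan(phi) - g * x**2 / (2.0 * v0**2 * math.cos(phi)**2)


def best_angle(x, v0=V0, g=G):
    """Angle that maximizes the height above a given x: tan(phi) = v0^2/(g x)."""
    return math.atan(v0**2 / (g * x))


def greatest_height_above(x, v0=V0, g=G):
    """Closed form (v0^4 - g^2 x^2) / (2 g v0^2)."""
    return (v0**4 - g**2 * x**2) / (2.0 * g * v0**2)


if __name__ == "__main__":
    for degrees in (30, 45, 60):
        phi = math.radians(degrees)
        print(degrees, round(horizontal_range(phi)), round(max_height(phi), 1))
    phi = best_angle(500.0)
    print(round(math.degrees(phi), 1), round(height_at(500.0, phi), 1))
```

Evaluating `height_at` at the angle returned by `best_angle` and comparing it with `greatest_height_above` checks the closed form independently of the algebra.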


9.10.1. r= 30 000 km. 

9.11.1. Find the second derivative $\frac{d^2y}{dz^2}$ of
the function y = F(z) (compare it with what
is said on pp. 337 and 216; note that
F(z) → 0 as z → 0 and z → ∞).

9.12.1. Setting the origin of coordinates
at the center of gravity of the rod we get
$I_0 = \int_{-l/2}^{l/2} x^2 \sigma\,dx = \sigma \int_{-l/2}^{l/2} x^2\,dx = \sigma\frac{l^3}{12}$; since
$m = \sigma l$, the last result can be written as
$I_0 = \frac{m l^2}{12}$. 9.12.2. Set the origin at the
joint of the pieces having different densities
so that for the first piece the density is $\sigma_1$ at x < 0
and for the second piece it equals $\sigma_2$ at
x > 0. Then $x_c = \frac{\sigma_2 l_2^2 - \sigma_1 l_1^2}{2(\sigma_1 l_1 + \sigma_2 l_2)}$. 9.12.3. On
choosing the coordinate system as indicated
in the hint we obtain $x_c = \frac{2}{3}L$; the moment
of inertia with respect to the origin is
$I = \int_0^L x^2 \sigma(x)\,dx = \frac{aL^4}{4}$. Since $m = \frac{aL^2}{2}$,
$a = \frac{2m}{L^2}$; therefore $I = \frac{mL^2}{2}$.


L 2 

(9.12.13) /o = /- 
whence 


mL \ , with 


f 




y(0) = i? 0 we find £ = 

" ' "o — — t" • ( In order to evaluate the 
— v 2 \ 

integral, it is sufficient to write the integ- 
rand in the form — — ^ — r-— — - 1 — 

v\ — v* Vi — v v i-f-y 
and find a and b by the method of undeter- 
mined coefficients; see exercises to Section 5.3.) 

vi + v 


Thus 
where A = 


we have In 
u i +^o 


Vi — v 

_ Vi + V 


In A -f-2 $vit, 
-.AeW vlt , 


For very large t we 


Va—V^ 

p- 1< + i 

have e 2 $ Vlt > 1 and therefore v~v 2 . We 
write the solution to the equation in the 
e -2$vit 

form u = v i — — — _2 . or u — v i 


A -j-e' 


- 2fivit 


2Vl A + e~ W* 
can neglect e ~ 2 $ vlt in the denominator since 
it is small compared to A; therefore in this 


case we have v — Vi ~ — 2y x 


a -2$vit 

~A 


U — Vx~' 


2 vi {Vq—Vi) _ 


^1+^0 


2&vi t. 


In view of 

£,-4 L; 


0 18 * 

9.13.1. (a) Compare with the hint to Exercise
9.12.3. Answer. The center of gravity belongs
to the median of the triangle and is
2/3 of the distance from the vertex along the
median (it coincides with the median point).
(b) Let a and b be the lengths of the bases of
the trapezoid, with a > b; the center of gravity
lies on the straight line connecting the
middle points of the bases at the distance
$\frac{h}{3}\cdot\frac{a + 2b}{a + b}$ from the lower base, with h the altitude
of the trapezoid. (c) The center of gravity
lies on the symmetry axis of the half-disk at
the distance $\frac{4}{3\pi} R$ from the diameter which
bounds the half-disk, with R the radius of
the half-disk.
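
Answers (a) and (c) of 9.13.1 can be verified by slicing each figure into thin strips, which is the same idea as in the hint. The sketch below is a plain midpoint-rule integration; the function names and the number of strips are illustrative choices, and the reference values 2/3 and $4R/(3\pi)$ are the standard ones.

```python
import math


def half_disk_centroid(radius, steps=100_000):
    """Distance of the centroid from the bounding diameter (midpoint rule)."""
    dy = radius / steps
    moment = 0.0
    area = 0.0
    for i in range(steps):
        y = (i + 0.5) * dy
        width = 2.0 * math.sqrt(radius**2 - y**2)  # chord at height y
        moment += y * width * dy
        area += width * dy
    return moment / area


def triangle_centroid_fraction(steps=100_000):
    """Centroid position as a fraction of the median, measured from the vertex.

    Strips parallel to the base have width proportional to their
    distance s from the vertex (median length taken as 1).
    """
    ds = 1.0 / steps
    moment = 0.0
    area = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        moment += s * s * ds
        area += s * ds
    return moment / area


if __name__ == "__main__":
    print(half_disk_centroid(1.0), 4.0 / (3.0 * math.pi))
    print(triangle_centroid_fraction(), 2.0 / 3.0)
```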

9.14.1. We write the equation thus -— = 

dv 

1 

* - q — / " 0 Sr , whence accounting for the ini- 

13 ( v\ — v 2 ) 
tial condition 


; comparing this 
with (9.14.24) we find C = - -- -- fe - ?■ — — . 

Vi-\-v Q 

9.14.2. In the case where the resistance is
proportional to the velocity the equation of
motion has the form $\frac{dv}{dt} = g - \frac{k}{m} v - \frac{A}{m}$,
where A is the buoyancy. In this case the
velocity $v_1 = \frac{mg - A}{k}$ sets in. By the
Archimedean law $A = V\rho' g$, where V is the
volume of the body and $\rho' = \rho_l$ is the density
of the liquid. Since $m = V\rho$ ($\rho = \rho_b$ is the density
of the body), we have $A = \frac{m g \rho'}{\rho}$ and
$v_1 = \frac{mg}{k}\left(1 - \frac{\rho'}{\rho}\right)$. For ρ' > ρ the body floats up
($v_1 < 0$), while for ρ' < ρ it sinks ($v_1 > 0$).


Chapter 10 

10.1.2. (a) The body stops at $x_0 = 1/4$.
(b) The stopping point is $x_0 \approx 0.9$. (c) There
are no points of stop. 10.1.3. The maximum
of u(x) is $u_{max} \approx 9.5$ at x = 8/3.
Noting that the left-hand branch of the graph
rises and the right-hand branch descends we
can construct a rough graph of u(x), which is,
however, sufficient for solving the problem.
(a) From the fact that the sum of the kinetic
and potential energies is constant it follows
that the two stopping points (at which the kinetic
energy is zero) are the two smallest roots
of the equation $-x^3 + 4x^2 = 6$. This equation
can be solved either graphically or by a
numerical method. We get $x_1 \approx -1.09$, $x_2 \approx 1.57$.
The body oscillates between the points $x_1$
and $x_2$. (b) One stopping point is x ≈ −2.04.
However, this point is to the left of the point
from which the body starts at the initial moment.
Since the initial velocity is directed to
the right, the body will move to the right
without reaching the point of stop. (c) One
stopping point is x = −2.04. In this case
the body will move to the left up to the stopping
point and then travel to the right.
10.1.4. (a) There are no points of stop. The body
will travel to the right. (b) There are two
stopping points: $x_1 = +\sqrt{9/11}$, $x_2 = -\sqrt{9/11}$.
The motion of the body is the oscillation
between the points $x_1$ and $x_2$. In the


__ case (a) t 


-,# + ) V 


1-f X 2 
4 -f 3z 2 


dx. In the case 


For rather large t we (b) t = 


: *°+j 1 

0.5 


20 4- 20a: 2 
9 — 11a: 2 


dx at x < x 1 


or 


and t = t 1 


f / 204 - 20 a : 2 

J V 9 — 11 a : 2 


dx at x 1 < x < 






< x 2 , where t 1 is the time when body reached 
x ± . (Note that the integrals remain finite 
though at the stop point the integrand is 
infinitely large.) 

10.2.1. (a) x = 2 sin t, (b) x = cos t, (c) x =
cos t + 2 sin t. This solution can be written
in the form x = C cos (t + α), where $C = \sqrt{5}$,
α = arctan (−2) ≈ −1.11, i.e. $x = \sqrt{5} \cos(t - 1.11)$.
In all these cases T = 2π.
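
The passage from x = cos t + 2 sin t to the amplitude-phase form x = C cos (t + α) in 10.2.1 (c) can be sketched in code. The helper name `amplitude_phase` is illustrative; it relies on the identity C cos (t + α) = (C cos α) cos t − (C sin α) sin t, so that a = C cos α and b = −C sin α.

```python
import math


def amplitude_phase(a, b):
    """Return (C, alpha) such that a*cos(t) + b*sin(t) == C*cos(t + alpha)."""
    amplitude = math.hypot(a, b)
    alpha = math.atan2(-b, a)  # from a = C cos(alpha), b = -C sin(alpha)
    return amplitude, alpha


if __name__ == "__main__":
    C, alpha = amplitude_phase(1.0, 2.0)
    print(f"C = {C:.4f}, alpha = {alpha:.2f}")
    # Spot-check the identity at a few arbitrary instants.
    for t in (0.0, 0.7, 2.5):
        lhs = math.cos(t) + 2.0 * math.sin(t)
        rhs = C * math.cos(t + alpha)
        print(f"t = {t}: {lhs:.6f} vs {rhs:.6f}")
```

Using `atan2` rather than a bare arctangent keeps the phase in the correct quadrant when the cosine coefficient is negative; for the coefficients of this exercise both give arctan (−2).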

10.3.1. (a) Let L be the length of the pen- 

i /~ m gl 

dulum. We know that co = y — — , and in 
7 2 r _ mL 2 , 

our case l = -^L,I = — - — (see Exercise 

9.12.3). Therefore G)= j/~ 

2n , / 3L , /IT 

T =—= 2n V ^f- 5A3 V T* The 

value of l corresponding to the maximum fre- 
quency is determined from the formula 


1 r 4 

obtain A=— — \ (1 — x 2 ) dx = -^- » whence 


dC 

dt 

then 


1 

hC 2 (P* X 4 
3nk 


Denote 


4/kd 3 

3nk 


= b; 


1 


bC 2 


— bC 2 and consequently = 
dt dt 

The solution to this equation is 


_1 
b 


whence C 


l 


_i /J± 

l ~V m ■ 


Since in our case 7 0 


max - 

mL 2 , L 

we have Z max = , therefore 


CCq ’ 1 + ’ 

here C 0 is the value of the amplitude at 
the initial time t= 0 which can be deter- 
mined from the initial conditions. (Note that 
the same law we have obtained for the velo- 
city of the rectilinear motion in the case 
of resistance proportional to the squared 
velocity (see (9.14.15)).) 10.4.2. In this case 
the work for a quarter of the period is 
— fC, therefore the average power is 

-fC-^r — — fC . We get the equation 

dC 2/(70) , dC 

d t 


18 

*°max = V 

T min 


V 18 


A - 1^18 


kC ^ ~ 

dt n 

therefore C = C 0 - 


2/o) 

kn 


■■ — and 
kn 


g 


2 L 


Knowing 0 ) max we find 

\2n L 

. (b) Here][Z = — , / = 7 0 + 


t. The oscillation s 
will cease at time t x when C — 0, that is, 


when Zi = - 


2/o) 


(we suppose that t ± > T). 


ml 2 — 


EL^max 

mL 2 , mL 2 


mL 2 


18 1 9 — 6 

The minimum period coincides here with 
that obtained in the case (a), but the point 
of suspension is different (however, Z max is 
as before). 

10.4.1. Let the oscillations be given 
by the formula x=Ccos(a>t + a), then 
v = — C(Dsin((Dt + a). We use the rela- 
dC 

tion kC — r — =Fiu , where F x = — hv I v I. 
at 

F ± v= — hv 2 \v \ — — hC s o) 3 | sin 3 (o)Z + a) | , 
and therefore the law of the change of the 

amplitude C can be written as kC — = 

dt 

— hC^afiA, where A = | sin 3 (coZ-f-a) | . 
Note that sin(o)Z + a) retains its sign when t 

varies from t, = to t 2 , with 

0) 0) 
t2 


10.5.1. From the first equation of the sys- 
/- ~ 2 ~ tern we get C 0 cos q) = a: 0 — a. Using this 
o)= y — — . expression we readily obtain from the second 




Xq — a 


equation C 0 sinq>=& — 

^ U COi (0i 0)i 

Squaring these two equations and adding the 
result yields C% =( b — — y — — + 

\ 0)] (Oi COi / 

(x 0 — a) 2 ; taking square roots of both sides 
of the equation we find C 0 and then find 
co v 0 x 0 — a 

x & cox coi y o)i 

tan cp — . 

x 0 — a 

10.7.2. Beats (cf. equation (10.7.5)). 

10.8.1. Use the fact that sin 3 ;r = 
sin x(i— cos 2x) _ sin x 

2 “ 2 
sin x cos 2x sin x sin 3x — sin x 


sm x sir x — - 


t 2 — ti — ~2 . Therefore A = 


- f 

n J 


n-a 

co 


j sin 3 (<£>t-{-a,) dt 

t_L _ 

t 2 1 1 


sin 3 (o)Z + &) dt. Changing in the 


3 1 

-£-sin x — sin 3x (representation of f (x) in 

the form (10.8.15)). Answer. y(x, Z) = 

3 1 

~r cos t sin x r-cosSZsinSa:. 10.8.2. We are 

4 4 

forced to abandon the condition D = 0 so 
that instead of (10.8.14) we get y {x, t) = 
knx kcnt , . knx 


2 b k * 


l 


l 


d b sin 


l 


X 


kcnt 


last integral the variable cos (coZ + a) = x we 


. 10.8.3. Conditions (10.8.2b) yield
φ(x) + ψ(x) = f(x); φ'(x) − ψ'(x) = 0, whence
$y(x, t) = \frac{1}{2}[f(x + ct) + f(x - ct)]$, where we
extend f outside the interval
(0, l) assuming that the function is odd and
periodic with period 2l.

10.9.3. (a) y-=-^ 4 (cos x C0 ^ X - + 

cos 3a: , \ ( Ci — C 2 \ 

— p r (b) y = [ 1 j n — 

oo 

2 (Ci — C 2 ) sr\ cos [(2k — 1) nx/l] , 


2k — 1 


sin ( nrcx/l ) 


•2 


(C!+C 2 )2 (-1)" 

n=l 

Chapter 11 


11.3.1. p=1.13p 0 ; p = 1.48p 0 ; P=3.67p 0 , 
where p 0 is the air pressure at the earth’s 
surface. 11.3.2. The dependence of the pres- 

-gh/b 


--p 0 e 


sure on altitude is of the form p 
RT 

where 6 = 1 0 3 — . For the temperature 

— 40 °C we have T = 273 — 40 = 233 K, 
8.3 X 10 3 X 233 

6 on / — 6.6 X 10 4 cm 2 /s 2 ; in 


Set 10 3 


RTp 

M 


= *o, 
P 


whence p=-, .. 

r b 0 (l — ah) 

tion for p assumes the form-^- — — g 
jlp — gdh 


or = ■ 


In p = "ft a In (1 — ah) + C , whence p ~ 

A(i — ah) §/b ° a , with A = e c . Since p = p 0 at 
h — 0, A = p 0 \ therefore p = p 0 (1 — ah) g / bo(X . 
11.3.4. p = p 0 (1 — 0.037x 10 ~ 5 6) 3 - 46 ; 

1.13p 0 ; p = 1.44p 0 ; p=-2.97p 0 . 


b o (1 — a6) ’ 


RC In -Jq- ~ O.IOS/?^. In veiw of 


Chapter 13 

13.2.1. The current decreases according 
t,o_ the Jaw. f== .Atlb£Jim£^ ljL7 which„ 

9 

interests us, we have/--jQ- ; 0 ; therefore 

^r/o = /V _<l/RC . whence e ~ tl,RC . 

9 

Taking logarithms yields ln-^Q- = — 

, 9 

whence £ x = 

the last formula we have $t_1 \approx 1$ s for
$R = 10^7\ \Omega$; $t_1 \approx 10.5$ s for $R = 10^8\ \Omega$; $t_1 \approx 105$ s for
$R = 10^9\ \Omega$. The time $t_2$, when the current
decreases by a factor of 2, is determined in
a similar way: $0.5 j_0 = j_0 e^{-t_2/RC}$, whence $t_2 = 0.693RC$.
Here $t_2 \approx 6.93$ s for $R = 10^7\ \Omega$;
$t_2 \approx 69.3$ s for $R = 10^8\ \Omega$; $t_2 \approx 693$ s for
$R = 10^9\ \Omega$. 13.2.2. We use formula (13.1.11)
and find $\varphi_{C_1} + \varphi_R + \varphi_{C_2} = 0$, whence $\varphi_R = -(\varphi_{C_1} + \varphi_{C_2})$.
The current in the circuit is

d(p Cl 

the same everywhere; therefore / = C i — 

„ d( fc 2 <Pr_ ( Pci + ( Pc 2 

C 2 — — ~~R — R * We obtain the 

*Pc 2 


29.4 

b 

this case 77 = — = 6.6 km. For the tempera- 
ture 40 °C we have T=313 K, 6 = 8. 8X 
10 4 cm 2 /s 2 , 77 = 8.8 km. 11.3.3. From the 

dT 

equation ~^p~ = ~ aT o it follows that T = 

— aT 0 h-{-C, where the constant C is deter- 
mined from the condition T = T 0 at 6 = 0, 
whence C = T 0 ] therefore T = T 0 (l — ah). 
The basic equation for determining the den- 
dp 

sity is ~[h~— — £P- We use Clapeyron’s equa- 
RT 

tion p = 10 3 p -jy- ■ and substitute the ex- 

£ m RTAi — och) 

pression for T to find p = 10 3 p — . 


dt 

equations 
1 


R 
d( P C, 


RC 2 

dt 

r C \C 2 


dt RCi ('Pc. + Vc.)* dt 

(TCi + T^)’ adding them yields 


1 + 

' RC 


dt 


where we set 


/-t , ^ (C is the total capacitance of 

oii ~C 2 

two capacitors connected in series, C i and 
C 2 ). Since q> Cj + Tc 2 = a at * = * ast 

equation yields ^ Cx J r^> C2 — ae~ tlRC . Clearly, 


d<? Ct n d ^c 2 


= 0 , 


or 




then p = p6 0 (1 — och), 
The differential equa- 

dp p 


dh " s 6 0 (1 — ah) * 
integrating we get 


dt z dt 
C 2 cp C2 ) = 0. Therefore ^iTci - ' = 

where i is a constant. Using the initial 
conditions cp Cx = a, q) c =0 at ^ = 0, we find 

A = Cia. Thus, q? c +q> c =ae ~ t/RC , C x q) c — 

r< 

C 2 ^>c 2 = C 1 a. Whence ( Pc t = a ~^~Zfc~ 
-t/Rc\ c i 


X 


'l+4±e 


Ci - r Vc * a ~ci+c 2 

e -t/RCy i3 # 2 # 3 # We label all the quantities 
belonging to the circuit before all linear 
dimensions are increased with the index 1 
and after the increase with the index 2. Then 

zS 2 &n 2 S 1 _ 


■(-1 + 


-RiCi 


p nCi, R 2 = p 


R 2 C 2 , C 2 - 
nl l _ 


4ji d 2 
Ri 


n 2 Oi 


Andxn 

Therefore 
$\tau_2 = R_2 C_2 = \frac{R_1}{n}\, n C_1 = R_1 C_1 = \tau_1$. The time constant
has not varied.

13.8.1. Let the potential difference be φ.
For the circuit shown in Figure 13.8.3 we
have $\varphi_E + \varphi_L + \varphi_C = 0$, or $L\frac{dj}{dt} + \varphi = E_0$
(since $\varphi_E = -E_0$). As $j = C\frac{d\varphi}{dt}$, $LC\frac{d^2\varphi}{dt^2} + \varphi = E_0$,
i.e. $LC\frac{d^2\varphi}{dt^2} = -(\varphi - E_0)$, or
$LC\frac{d^2 z}{dt^2} = -z$ with $z = \varphi - E_0$. The solution to
the last equation has the form
$z = B\cos(\omega t + \alpha)$, with $\omega = 1/\sqrt{LC}$; therefore
$\varphi = B\cos(\omega t + \alpha) + E_0$. But φ = 0, j = 0 at
t = 0; using this fact we get $B = -E_0$, α = 0.
Finally we have $\varphi = E_0(1 - \cos\omega t)$. The value
$\varphi_{max} = 2E_0$ corresponds to the equation
$\cos\omega t = -1$, i.e. $t = \pi/\omega = T/2$; it is attained
at half the period of oscillations. 13.8.2. The
energy of the capacitance is $W = C\varphi^2/2 = 4CE_0^2/2 = 2CE_0^2$;
the energy released by the voltage
source is $P = qE_0 = C\varphi E_0 = 2CE_0^2$.

13.9.1. f(t)= e~’ Kt sin o)*, with 


$\lambda = \frac{R}{2L}$, $\omega^2 = \frac{1}{LC} - \lambda^2$. For the given three
cases we have $j(t) = -1.0025 e^{-0.05t} \sin t$;
$j(t) = -1.031 e^{-0.25t} \sin 0.97t$;
$j(t) = -1.15 e^{-0.5t} \sin 0.87t$. 13.9.2.
$j(t) = j_0\left(\cos\omega t - \frac{\lambda}{\omega}\sin\omega t\right)e^{-\lambda t}$; for the formulas for λ
and ω see Exercise 13.9.1. For the given
cases we have $j(t) = e^{-0.05t}(-\cos t + 0.05 \sin t)$;
$j(t) = e^{-0.25t}(-\cos 0.97t + 0.26 \sin 0.97t)$;
$j(t) = e^{-0.5t}(-\cos 0.86t + 0.58 \sin 0.86t)$.
13.9.3. If R is very great, then the current
flowing through the resistance is small, i.e.
the current flows mainly through the inductance.
Therefore the greater R is, the
closer this circuit is to that shown in Figure
13.8.1, where $\varphi = \varphi_0 \cos(\omega t + \alpha)$. If R is
great, we can assume that φ in the circuit
of Figure 13.9.2 has the same form, while
dP 

cp 0 varies slowly with time. Here -jj'— — A; 

but h = Rj 2 lJ where j 1 is the current flowing 

through the resistance i?, / 1 = -2- ; therefore 

R 


R 


cpg cos 2 (co* + a) 
1 R 


and h = . Thus, 

ZR 


S' - Noting that P- 


2 


= d(p0 - 
dt dt ~ 


2 R 


Whence 


, we find 

dq> o _ 
dt 


IRC 


cp 0 . Therefore (p 0 = Ae 


2 RC 


; X= 


2 RC' 


13.10.1. $\varphi(t) = e^{-t} + t e^{-t}$;
$\varphi(t) = -0.03 e^{-5.83t} + 1.03 e^{-0.17t}$;
$\varphi(t) = -0.01 e^{-9.9t} + 1.01 e^{-0.1t}$. 13.10.2. $\varphi(t) = e^{-t} + 2t e^{-t}$;
$\varphi(t) = -0.37 e^{-3.73t} + 1.37 e^{-0.27t}$.


Part 3 


Chapter 14 


u s 


14, 1.1. (a) u = x s — 3 xy 2 — 3x-f-l, u—3 x 2 y- 
x 3 -\-x-\-xy 2 


3 y; (b) u = 


y- 


-y* 


( 1 + z 2 — y 2 ) 2 + Ax 2 y 2 


(l-J-a: 2 — y 2 ) 2 -\-Ax 2 y 2 ’ 
(c) u = 


( 2 ) x n ~ 2 y 2 + ( ^ ) * n-4 i/ 4 + ■ • • » v = ( \ ) x 

x n ~ y y — ^ g j x n ~ s y 3 - f- ^ ^ j x n ~ h y b — ..., where 


$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$ is the number of combinations
of n elements taken k at a time. 14.1.2.
Formulas (14.1.6a) and (14.1.6b) can be proved
proceeding from the equations $z^{-1} z = 1$ and
$(z^{1/3})^3 = z$; the general formula (14.1.6) follows
from the fact that it is valid for any rational
$n = p/q = (1/q)\,p$, where the cases of p being
positive or negative must be considered separately.
14.1.3. Use the fact that if
$c = r(\cos\alpha + i \sin\alpha)$ then $\sqrt[n]{c} = \rho(\cos\beta + i \sin\beta)$,
with $\rho = \sqrt[n]{r}$, and $\beta = \beta_0 + \frac{2\pi k}{n}$, with $\beta_0 = \frac{\alpha}{n}$,
k = 0, 1, 2, ..., n − 1. 14.1.4. In accordance
with the results of Exercise 14.1.3 we
have $\sqrt[4]{-1} = \cos\beta + i \sin\beta$ with β = π/4,
3π/4, 5π/4 (= −3π/4), 7π/4 (= −π/4); note
also that in all cases cos β and sin β equal
$\pm\sqrt{2}/2$.

14.3.1. If ln u = w, then $u = e^w$ and $u^z = e^{wz}$;
use this fact. 14.3.2. Clearly $e^e \approx (2.7)^{2.7} > 1 > i^i$,
while the numbers $e^i$ (= cos 1 + i sin 1,
where the angles are measured in radians)
and $i^e$ ($= e^{e \ln i} = e^{(e\pi/2) i}$) cannot be
compared (we can compare in magnitude only
real numbers and not arbitrary complex
numbers).
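
The four powers discussed in 14.3.2 can be evaluated directly. The sketch below uses Python's standard `math` and `cmath` modules and takes the principal value of the logarithm, ln i = iπ/2, which is an assumption about the branch intended in the exercise.

```python
import cmath
import math

# Real powers: e**e is an ordinary real number greater than 1.
e_to_e = math.e ** math.e

# i**i = exp(i * ln i) = exp(i * i*pi/2) = exp(-pi/2), a real number below 1.
i_to_i = 1j ** 1j

# e**i = cos 1 + i sin 1 lies on the unit circle.
e_to_i = cmath.exp(1j)

# i**e = exp(e * ln i) = exp(i * e*pi/2) also lies on the unit circle.
i_to_e = cmath.exp(math.e * cmath.log(1j))

if __name__ == "__main__":
    print("e^e =", e_to_e)
    print("i^i =", i_to_i)
    print("e^i =", e_to_i, " modulus", abs(e_to_i))
    print("i^e =", i_to_e, " modulus", abs(i_to_e))
```

Both $e^i$ and $i^e$ have modulus 1 but nonzero imaginary parts, so neither "greater" nor "less" has a meaning for them, in line with the remark in the solution.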

14.4.3. (a) The circle; (b) the hyperbola.
We have for (a) and (b) $t = \tan\frac{\varphi}{2}$ and
$t = \tanh\frac{\psi}{2}$ respectively if at the same time the
curves are also given by the "standard"
parametric equations: (a) x = cos φ, y = sin φ;
(b) x = cosh ψ, y = sinh ψ.


Chapter 15 

15.1.1. All derivatives have the form
$P\left(\frac{1}{x}\right) e^{-1/x^2}$, where P(z) is a polynomial
(in the variable z); at x = 0 they are equal to zero
due to the high rate of decrease of the function
$y = e^{-1/x^2}$ as x → 0 (cf. Section 6.5). For
example, $f'(x) = \frac{2}{x^3} e^{-1/x^2}$, but $e^t$, with
$t = -1/x^2$, decreases more rapidly as t → −∞
than the function $2|t|^{3/2}$ ($= |2/x^3|$) increases
as t → −∞.

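15.2.1. If w = ln z, then $u = \ln\sqrt{x^2 + y^2}$
and $v = \arctan\frac{y}{x}$, whence $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = \frac{x}{x^2 + y^2}$,
$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} = \frac{y}{x^2 + y^2}$. 15.2.2. Use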

the fact that , v ± — . 

dx dx 


Chapter 16 


16.2.1. Near the point $x = x_0$ we expand
the function φ(x) in a series retaining only
the first two terms: $\varphi(x) \approx \varphi(x_0) + \varphi'(x_0)(x - x_0) = \varphi'(x_0)(x - x_0)$.
Thus denoting $\varphi'(x_0) = c$,
$x - x_0 = y$, we get $\delta(\varphi(x)) = \delta(cy) = \frac{1}{|c|}\delta(y) = \frac{\delta(x - x_0)}{|\varphi'(x_0)|}$.
In a more
general case the function δ(φ(x)) is the sum
of similar expressions valid for the zeros
of φ. 16.2.2. The function φ(x) = sin x vanishes
at $x_0 = k\pi$, k = 0, ±1, ±2, ...;
$|\varphi'(x_0)| = |\cos x_0| = |\cos k\pi| = 1$; therefore
$\delta(\sin x) = \sum_{k=-\infty}^{+\infty} \delta(x - k\pi)$ and
$\int_{-\infty}^{+\infty} \psi(x)\,\delta(\sin x)\,dx = \sum_{k=-\infty}^{+\infty} \psi(k\pi)$.


16.3.1. (a) y'(x) = 1 − δ(x − 1). (b) Since
y(+0) = 1, y(−0) = 0, we have Δy = 1 at
x = 0. For x ≠ 0 we have
$y'(x) = -\frac{e^{1/x}}{x^2 (1 + e^{1/x})^2}$, and finally
$y'(x) = \delta(x) - \frac{e^{1/x}}{x^2 (1 + e^{1/x})^2}$.

16.4.1. Multiply the right-hand side of
(16.4.8) by $\sin\frac{x}{2}$ and use the fact that
$\cos kx \sin\frac{x}{2} = \frac{1}{2}\left[\sin\left(k + \frac{1}{2}\right)x - \sin\left(k - \frac{1}{2}\right)x\right]$.



Chapter 17 

17.3.1. If T is the streamline, then by the 
definition of the function the tangent vector 

t ^ j to T (i.e. to the line 

if> = constant) coincides with the vector v of 
the flow rate which means that along the 


streamline of the liquid whose direction at 
each point is y = y(x, y)d ^ is equal to zero, 
(b) Let T be a small arc with the endpoints 
P = P(x , y ), Q = Q (x + dx, y + dy) and R = 
R(x - j-dx , y); consider the flow along the bro- 
ken line PQR assuming that the vector of the 
flow rate is v ( x , y) and prove that (accurate 
to infinitesimals of the first order) this flow 
equals dty — ty (x-{-dx, y-\-dy) — \|? (x, y). 

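17.3.2. Use the fact that the tangent vectors to the lines $\varphi = \text{constant}$ and $\psi = \text{constant}$ at point $M(x, y)$ are $\mathbf{v}_\varphi = \left(\frac{\partial\varphi}{\partial y}, -\frac{\partial\varphi}{\partial x}\right)$ and $\mathbf{v}_\psi = \left(\frac{\partial\psi}{\partial y}, -\frac{\partial\psi}{\partial x}\right)$, i.e. $\mathbf{v}_\varphi = (v_y, -v_x)$ and $\mathbf{v}_\psi = (v_x, v_y)$, and then take advantage of the condition for two vectors to be perpendicular to each other. 17.3.4. (a) When $a$ is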
real, the streamlines are straight lines parallel to the $x$ axis; the equipotential lines are parallel to the $y$ axis; when $a$ is pure imaginary, the flow is conjugate to that described. (b) When $a$ is real, the streamlines are the hyperbolas $xy = \text{constant}$; the equipotential lines are the hyperbolas $x^2 - y^2 = \text{constant}$; the flow rate at point $M$ is proportional to the distance $OM$. (c) When $a$ is real, the streamlines are the circles tangent to the $Ox$ axis at point $O$; the equipotential lines are the circles tangent to the $Oy$ axis at point $O$. (d) For $w = \ln z$ the streamlines are straight lines emanating from point $O$; the equipotential lines are circles with center at $O$.

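17.4.1. Equating to zero (equilibrium) the sum of the projections of the forces onto the direction of the $y$ axis, we get (for the case of small deviations, $y \ll l$)

$$1 - k\,\frac{y_1}{x_1} - k\,\frac{y_1}{l - x_1} = 0.$$

Whence

$$y_1 = y(x_1) = \frac{1}{\dfrac{k}{x_1} + \dfrac{k}{l - x_1}} = \frac{x_1(l - x_1)}{kl},$$

$$y(x, x_1) = \begin{cases} \dfrac{x}{x_1} \cdot \dfrac{1}{\dfrac{k}{x_1} + \dfrac{k}{l - x_1}} = \dfrac{x(l - x_1)}{kl}, & 0 \le x \le x_1, \\[3ex] \dfrac{l - x}{l - x_1} \cdot \dfrac{1}{\dfrac{k}{x_1} + \dfrac{k}{l - x_1}} = \dfrac{x_1(l - x)}{kl}, & x_1 \le x \le l. \end{cases}$$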


The function $y(x, x_1)$ is called Green's function in the problem on the string. For an arbitrarily distributed force $f(x)$ the deviation of the string is given by the formula

$$y(x) = \int_0^l f(x_1)\,y(x, x_1)\,dx_1.$$

Note that Green's function helped us find the function $y(x)$ when we did not even know the equation governing its behavior (this equation has the form $k\,\frac{d^2 y}{dx^2} = -f(x)$, $y(0) = 0$, $y(l) = 0$).
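The superposition formula for the string can be tried out on a uniform load, for which the integral is easy to evaluate by hand: with $f(x) = f_0$ it gives $y(x) = f_0\,x(l - x)/(2k)$. The sketch below assumes that Green's function has the piecewise form $x(l - x_1)/(kl)$ for $x \le x_1$ and $x_1(l - x)/(kl)$ for $x \ge x_1$; the function names, the numerical values and the midpoint rule are choices made only for this illustration.

```python
def green(x, x1, l, k):
    """Deflection at x caused by a unit force applied at x1 (assumed piecewise form)."""
    if x <= x1:
        return x * (l - x1) / (k * l)
    return x1 * (l - x) / (k * l)

def deflection(x, force, l, k, n=2000):
    """Midpoint-rule estimate of the integral of force(x1) * green(x, x1) over [0, l]."""
    h = l / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        total += force(x1) * green(x, x1, l, k)
    return total * h

# Hypothetical sample values: string length, tension parameter, load density.
l, k, f0 = 2.0, 3.0, 5.0
x = 0.7
y_num = deflection(x, lambda x1: f0, l, k)
y_exact = f0 * x * (l - x) / (2 * k)
print(y_num, y_exact)
```

The parabola $f_0\,x(l - x)/(2k)$ vanishes at both ends of the string and has second derivative $-f_0/k$.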

17.4.2. The general solution to the equation without the force is $x(t) = C_1 \sin\omega t + C_2 \cos\omega t$, where $\omega = \sqrt{k/m}$ and $C_1$ and $C_2$ are arbitrary constants. Since the delta function differs from zero only at $t = \tau$, the solution to the equation with the delta-like force and the state of rest at $t = -\infty$ has the form

$$x(t) = \begin{cases} 0 & \text{for } -\infty < t < \tau, \\ C_1 \sin\omega t + C_2 \cos\omega t & \text{for } \tau < t < +\infty; \end{cases} \qquad (*)$$

the delta-like force imparts a unit impulse to the body, therefore after the action of the delta force the body at rest acquires an initial velocity $v_0 = \frac{1}{m}$, the initial position remaining zero. The solution to the equation of oscillations with such initial conditions at $t = \tau$ has the form

$$x(t, \tau) = \begin{cases} 0 & \text{for } -\infty < t < \tau, \\ \dfrac{1}{m\omega} \sin\omega(t - \tau) & \text{for } \tau < t < +\infty. \end{cases}$$

In other words, in formula (*) $C_1 = \dfrac{\cos\omega\tau}{m\omega}$, $C_2 = -\dfrac{\sin\omega\tau}{m\omega}$.

The solution to the problem with an arbitrary force $f(t)$ is

$$x(t) = \int_{-\infty}^{+\infty} f(\tau)\,x(t, \tau)\,d\tau = \int_{-\infty}^{t} f(\tau)\,\frac{1}{m\omega}\,\sin[\omega(t - \tau)]\,d\tau.$$